Deep Learning Engineer, Datacenters

4 Months ago • 3 Years + • Research Development

Job Summary

Job Description

NVIDIA's Deep Learning Engineer in Datacenters will help develop software infrastructure to analyze deep learning applications, evolve cost-efficient datacenter architectures for LLMs, and work with experts to develop analysis and profiling tools in Python, bash, and C++. Responsibilities involve analyzing system and software characteristics of DL applications, developing analysis tools, and measuring key performance metrics to estimate efficiency improvements. The role requires collaboration with various teams across NVIDIA, from research to silicon architecture. The ideal candidate will have experience with system software, GPU kernels, or DL frameworks and a strong understanding of system architecture and performance.
Must have:
  • Bachelor's degree in EE/CS (Master's/PhD preferred)
  • 3+ years relevant experience
  • System software/Silicon architecture experience
  • C/C++ and Python programming
  • Deep Learning application analysis
Good to have:
  • CUDA, PyTorch, TensorFlow
  • Containerization (Docker), Slurm
  • Performance monitoring tools (perf, gprof)
  • Performance modeling (CPU, GPU, Memory, Network)
  • Multi-site or multi-functional team experience

Job Details

As NVIDIA makes inroads into the Datacenter business, our team plays a central role in getting the most out of our exponentially growing datacenter deployments as well as establishing a data-driven approach to hardware design and system software development. We collaborate with a broad cross-section of teams at NVIDIA, ranging from DL research teams to CUDA Kernel and DL Framework development teams to Silicon Architecture teams. As our team grows, and as we seek to identify and take advantage of long-term opportunities, our skillset needs are expanding as well.

Do you want to influence the development of high-performance Datacenters designed for the future of AI? Do you have an interest in system architecture and performance? In this role, you will learn how CPU, GPU, networking, and IO relate to deep learning (DL) architectures for Natural Language Processing, Computer Vision, Autonomous Driving, and other technologies. Come join our team, and bring your interests to help us optimize our next-generation systems and Deep Learning Software Stack.

What you'll be doing:

  • Help develop software infrastructure to characterize and analyze a broad range of Deep Learning applications.
  • Evolve cost-efficient datacenter architectures tailored to meet the needs of Large Language Models (LLMs).
  • Work with experts to help develop analysis and profiling tools in Python, bash, and C++ to measure key performance metrics of DL workloads running on NVIDIA systems.
  • Analyze system and software characteristics of DL applications.
  • Develop analysis tools and methodologies to measure key performance metrics and to estimate potential for efficiency improvement.

What we need to see:

  • A Bachelor’s degree in Electrical Engineering or Computer Science with 3 or more years of relevant experience (Master’s or PhD preferred).
  • Experience in at least one of the following:
    • System Software: Operating Systems (Linux), Compilers, GPU kernels (CUDA), DL Frameworks (PyTorch, TensorFlow).
    • Silicon Architecture and Performance Modeling/Analysis: CPU, GPU, Memory or Network Architecture
  • Experience programming in C/C++ and Python. Exposure to Containerization Platforms (Docker) and Datacenter Workload Managers (Slurm) is a plus.
  • Demonstrated ability to work in virtual environments, and a strong drive to own tasks from beginning to end. Prior experience with such environments will make you stand out.

Ways to stand out from the crowd:

  • Background with system software, Operating system intrinsics, GPU kernels (CUDA), or DL Frameworks (PyTorch, TensorFlow).
  • Experience with silicon performance monitoring or profiling tools (e.g., perf, gprof, nvidia-smi, DCGM).
  • In-depth performance modeling experience in any one of CPU, GPU, Memory, or Network Architecture.
  • Exposure to Containerization Platforms (Docker) and Datacenter Workload Managers (Slurm).
  • Prior experience with multi-site teams or multi-functional teams.

NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people on the planet working for us. If you're creative and autonomous, we want to hear from you!

#LI-Hybrid

About The Company

Since its founding in 1993, NVIDIA (NASDAQ: NVDA) has been a pioneer in accelerated computing. The company’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics, ignited the era of modern AI and is fueling the creation of the metaverse. NVIDIA is now a full-stack computing company with data-center-scale offerings that are reshaping industry.