Senior Performance Engineer

23 Hours ago • 8 Years + • Research & Development • $184,000 PA - $356,500 PA

Job Summary

Job Description

NVIDIA seeks a Senior Performance Engineer to lead performance practices in large-scale AI infrastructure. Responsibilities include aligning next-generation AI workloads with datacenter designs, delivering performance insights, resolving complex issues, and collaborating with various teams (HW/FW/SW). The role involves working with accelerated computing software stacks, DL frameworks (CUDA, PyTorch), cloud architectures, and container technologies. This individual will analyze, debug, and resolve critical issues to optimize AI workload performance at scale.
Must have:
  • 8+ years in accelerated computing
  • Understanding of accelerated computing software stacks and DL Frameworks
  • Experience with modern Cloud and container-based architectures
  • C/C++/Python/Bash programming
  • Experience with CPU architecture
  • Understanding of collective communication in AI workloads
Good to have:
  • At-scale DL training experience
  • DL and graph compiling programming skills
  • Exposure to virtualization techniques and cloud platform solutions
  • Experience with large scale HPC environments
Perks:
  • Equity
  • Benefits

Job Details

NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It’s a unique legacy of innovation that’s fueled by great technology—and amazing people. Today, we’re tapping into the unlimited potential of AI to define the next era of computing. An era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what’s never been done before takes vision, innovation, and the world’s best talent. As an NVIDIAN, you’ll be immersed in a diverse, supportive environment where everyone is inspired to do their best work. Come join the team and see how you can make a lasting impact on the world.

We are looking for an outstanding Senior Performance Engineer to work on at-scale AI system performance and datacenter applications. Be a key player on the most exciting computing hardware and software, contributing to the latest breakthroughs in artificial intelligence and GPU computing! Provide insights on at-scale system design and tuning mechanisms for large-scale compute runs. You will work with the latest Accelerated Computing and Deep Learning software and hardware platforms, and with many researchers, developers, and customers to craft improved workflows and develop new, leading differentiated solutions. You will interact with HPC, OS, CPU and GPU compute, and systems specialists to architect, develop, and bring up large-scale performance platforms.

What you'll be doing:

  • Lead all aspects of implementing performance practices in large-scale infrastructure, and deliver powerful tools, methodologies, and flows to validate and improve several datacenter products in parallel.

  • Align next-generation AI workloads with next-generation datacenter designs. This involves early engagement with internal and customer HW/FW/SW/platform teams, and other groups.

  • Deliver engineering solutions that provide continuous insight into the performance of AI workloads across evolving environments, quickly surfacing improvements and regressions over time.

  • Decompose high-complexity performance or stability issues into minimal reproduction cases, working toward the root cause of the underlying problems.

  • Engage with various SW and FW (BMC/SBIOS/OS/drivers, etc.) teams to develop best-in-class practices and tools, and analyze, debug, and resolve critical firmware and software issues for the best AI workload performance at scale.

What we need to see:

  • 8+ years of experience in using accelerated computing for datacenter container computing solutions.

  • Proven understanding of accelerated computing software stacks and DL Frameworks (CUDA, PyTorch).

  • Experience using and handling modern Cloud and container-based Enterprise computing architectures.

  • C/C++/Python/Bash programming/scripting experience.

  • Experience with CPU architecture.

  • Experience with container technology and Linux based OSes.

  • Understanding of collective communication and the patterns in AI workloads.

  • Experience working with the engineering or academic research community supporting high performance computing or deep learning.

  • Strong verbal and written communication skills, as well as strong teamwork skills.

  • Action-driven, with strong analytical skills.

  • BS in Engineering, Mathematics, Physics, or Computer Science (or equivalent experience); MS or PhD desirable.

Ways to Stand Out From the Crowd:

  • At-scale DL training experience.

  • DL and graph compiling programming skills.

  • Exposure to virtualization techniques and cloud platform solutions.

  • Exposure to scheduling and resource management systems.

  • Experience with large scale HPC environments.

The base salary range is 184,000 USD - 356,500 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

About The Company

Since its founding in 1993, NVIDIA (NASDAQ: NVDA) has been a pioneer in accelerated computing. The company’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics, ignited the era of modern AI and is fueling the creation of the metaverse. NVIDIA is now a full-stack computing company with data-center-scale offerings that are reshaping industry.
