Diagnostic Software Manager - Server

1 Week ago • 8 Years + • Research & Development

Job Summary

Job Description

NVIDIA seeks a Diagnostic Software Manager - Server to lead a team of software engineers responsible for developing and improving system stress applications for its data center products. This role involves collaborating with cross-functional teams (architecture, ASIC, systems engineering, operations) to create software that rigorously tests GPU servers in customer and partner environments. Responsibilities include managing multiple concurrent projects, mentoring engineers, recruiting new talent, driving root cause analysis, and developing long-term team strategies to address future challenges. The ideal candidate possesses strong system software expertise (8+ years), team management experience (4+ years), and proficiency in C/C++ and Python. The position is crucial for improving product quality and production efficiency, directly impacting NVIDIA's gross margin.
Must have:
  • 8+ years system software experience
  • 4+ years team management experience
  • Proficiency in C/C++
  • Strong system design skills
  • Understanding of computer architecture
Good to have:
  • Python programming
  • GPU compute/server tech knowledge
  • Experience with BMC, InfiniBand, PCIe, NVLink
  • RAS software engineering experience
Perks:
  • Competitive salary
  • Generous benefits package

Job Details

We seek a manager to lead all aspects of a team of software engineers tasked with improving and crafting a collection of system stress applications tailored for NVIDIA's forthcoming data center products, operational within customer and partner infrastructures. Our focus lies in crafting software that subjects GPU servers to the most thorough testing scenarios imaginable. Our team collaborates closely with architecture, ASIC, systems engineering, and operations teams to devise methodologies aimed at pushing every hardware component to its limits. Situated at the core of NVIDIA's data center enterprise, from GPU baseboards to standalone servers and entire clusters, we are responsible for developing the comprehensive suite of system stress applications. We partner with NVIDIA operations teams to find an efficient balance between product quality, test yield, and manufacturing efficiency. Wouldn't you want to be a key factor in NVIDIA's gross margin?

What you will be doing:

  • Collaborate with cross-functional teams on NPI (new product introduction) projects, and improve and refine the software deployed on our customers' servers and environments to facilitate detailed identification of hardware or software issues.

  • As the manager, you will run multiple concurrent projects through active prioritization and communication.

  • On the engineer management side, we want the manager to continue to groom future technical leaders in the team and recruit new talent.

  • Constant development is another area of responsibility. We look for candidates who are proactive: those who seek opportunities to improve NVIDIA product quality and production efficiency.

  • We also need our candidates to be reactive: able to drive root cause analysis of critical issues and embrace corrective actions.

  • Finally, we need our leaders to develop long-range strategies for the team to prepare for new challenges and drive execution.

What we need to see:

  • Bachelor of Science in Computer Science, Computer Engineering, or Electrical Engineering (or equivalent experience).

  • 8+ years of overall system software experience, including 4+ years of team management experience, with a deep understanding of software development principles and comfort working in a large codebase and a deep driver stack.

  • Good system design skills

  • Good programming skills in C/C++; Python programming is a plus.

  • Solid understanding of computer architecture, operating systems, kernel drivers, and device programming.

  • Experience driving feature development and multi-team debug.

Ways to stand out from the crowd:

  • Knowledge of GPU compute or server product technologies such as BMC (Baseboard Management Controller), InfiniBand, PCIe, and NVLink.

  • Extensive experience collaborating with customer software teams

  • Strong experience engineering software with RAS (reliability, availability, and serviceability) in mind

  • Comfortable with the unknown and with change

With competitive salaries and a generous benefits package, NVIDIA is widely considered to be one of the most desirable employers in the world. We have some of the most brilliant and talented people in the world working for us. If you are creative, autonomous and love a challenge, we want to hear from you. We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.

#LI-Hybrid 

About The Company

Since its founding in 1993, NVIDIA (NASDAQ: NVDA) has been a pioneer in accelerated computing. The company’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics, ignited the era of modern AI and is fueling the creation of the metaverse. NVIDIA is now a full-stack computing company with data-center-scale offerings that are reshaping industry.

