Diagnostic Software Manager - Server

1 Month ago • 8 Years + • Research & Development

Job Summary

Job Description

NVIDIA seeks a Diag Software Manager - Server to lead a team of software engineers responsible for developing and improving system stress applications for its data center products. This role involves collaborating with cross-functional teams (architecture, ASIC, systems engineering, operations) to create software that rigorously tests GPU servers in customer and partner environments. Responsibilities include managing multiple concurrent projects, mentoring engineers, recruiting new talent, driving root cause analysis, and developing long-term team strategies to address future challenges. The ideal candidate possesses strong system software expertise (8+ years), team management experience (4+ years), and proficiency in C/C++ and Python. The position is crucial for improving product quality and production efficiency, directly impacting NVIDIA's gross margin.
Must have:
  • 8+ years system software experience
  • 4+ years team management experience
  • Proficiency in C/C++
  • Strong system design skills
  • Understanding of computer architecture
Good to have:
  • Python programming
  • GPU compute/server tech knowledge
  • Experience with BMC, InfiniBand, PCIe, NVLink
  • RAS software engineering experience
Perks:
  • Competitive salary
  • Generous benefits package

Job Details

We seek a manager to lead all aspects of a team of software engineers tasked with improving and crafting a collection of system stress applications tailored for NVIDIA's forthcoming data center products, operational within customer and partner infrastructures. Our focus lies in crafting software that subjects GPU servers to the most thorough testing scenarios imaginable. Our team collaborates closely with architecture, ASIC, systems engineering, and operations teams to devise methodologies aimed at pushing every hardware component to its limits. Situated at the core of NVIDIA's data center enterprise, from GPU baseboards to standalone servers and entire clusters, we are responsible for developing the comprehensive suite of system stress applications. We partner with NVIDIA operations teams to find an efficient balance between product quality, test yield, and manufacturing efficiency. Wouldn't you want to be a key factor in NVIDIA's gross margin?

What you will be doing:

  • Collaborate with multi-functional teams on NPI (new product introduction) projects, and improve and refine software deployed on our customers' servers and environments, facilitating detailed identification of hardware or software issues.

  • As the manager, you will run multiple concurrent projects through active prioritization and communication.

  • On the engineer management side, we want the manager to continue to groom future technical leaders in the team and recruit new talent.

  • Constant development is another area of responsibility. We look for candidates who are proactive and seek opportunities to improve NVIDIA's product quality and production efficiency.

  • We also need our candidates to be reactive: able to drive root cause analysis of critical issues and embrace corrective actions.

  • Finally, we need our leaders to develop long-range strategies for the team to prepare for new challenges and drive execution.

What we need to see:

  • Bachelor of Science in Computer Science, Computer Engineering, or Electrical Engineering (or equivalent experience).

  • 8+ years of overall system software experience, including 4+ years of team management experience; a deep understanding of software development principles; and comfort working in a large code base and a deep driver stack.

  • Good system design skills

  • Good programming skills in C/C++; Python programming is a plus.

  • Solid understanding of computer architecture, operating systems, kernel drivers, and device programming.

  • Experience driving feature development and multi-team debug.

Ways to stand out from the crowd:

  • Knowledge of GPU compute or server product technologies like BMC (Baseboard Management Controller), InfiniBand, PCIe, NVLink.

  • Extensive experience collaborating with customer software teams

  • Strong experience engineering software with RAS (reliability, availability, and serviceability) in mind

  • Comfortable with the unknown and with change

With competitive salaries and a generous benefits package, NVIDIA is widely considered to be one of the most desirable employers in the world. We have some of the most brilliant and talented people in the world working for us. If you are creative, autonomous and love a challenge, we want to hear from you. We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.

About The Company

Since its founding in 1993, NVIDIA (NASDAQ: NVDA) has been a pioneer in accelerated computing. The company’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics, ignited the era of modern AI and is fueling the creation of the metaverse. NVIDIA is now a full-stack computing company with data-center-scale offerings that are reshaping industry.
