Senior System Reliability Engineer

3 Weeks ago • 6-8 Years • Research & Development • $140,000 PA - $264,500 PA

Job Summary

Job Description

NVIDIA seeks a Senior System Reliability Engineer to contribute to the reliability of its GPU servers and high-performance computing systems. Responsibilities include establishing and maintaining product reliability standards, participating in design reviews, working with suppliers and partners, defining reliability plans, performing testing and failure analysis, and correlating test results with field performance. This role requires expertise in hardware reliability engineering for electronics and server systems, including graphics cards, servers, racks, and clusters, across the entire product lifecycle. The ideal candidate will have extensive experience with PCIe peripherals, graphics cards, and servers, strong statistical analysis skills, and excellent communication abilities.
Must have:
  • Hardware Reliability Engineering Expertise
  • Experience with PCIe peripherals, graphics cards, and servers
  • Strong statistical analysis skills
  • Excellent communication skills
  • Design for Reliability (DfR) methods
  • Failure analysis and recommendations
Good to have:
  • MS or PhD in relevant field
Perks:
  • Competitive salary
  • Generous benefits package

Job Details

NVIDIA has continuously reinvented itself over two decades. Our invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing, with the GPU acting as the brains of computers, robots, and self-driving cars that can perceive and understand the world. Today, we are increasingly known as “the AI computing company.” We're looking to grow our company and build our teams with the most thoughtful people in the world. Join us at the forefront of technological advancement.

GPU servers are one of the fastest-growing segments for NVIDIA and the artificial intelligence industry. As computational power increases with every GPU generation, developing efficient and reliable systems is imperative. We are looking for a System Reliability Engineer to join NVIDIA's existing Reliability Engineering team, which supports NVIDIA's diverse system product range, specifically graphics and high-performance computing printed circuit boards and data center servers.


What you'll be doing:

  • Provide expertise in Hardware Reliability Engineering for Electronics/Server Systems (graphics cards, server, rack, cluster) from Concept to End-of-Life phase.

  • Establish, deliver, and maintain product reliability standards and metrics for NVIDIA's new system technologies, using existing tools and processes or developing new ones as required.

  • Participate in product and engineering design reviews, assess the reliability budget of products/designs, and inspire changes that enhance product reliability.

  • Work with all pertinent engineering groups, suppliers, and partners to ensure the desired reliability is achieved using Design for Reliability (DfR) methods, including FMEA and DoE approaches.

  • Define and implement Reliability Plans & Specifications.

  • Provide reliability predictions, along with test plans and methods to assess and drive product reliability to the desired levels.

  • Perform and lead appropriate testing with associated failure analysis and recommendations for improving designs and manufacturing.

  • Develop and present methods of correlating reliability test results with actual field performance.


What we need to see:

  • BS (or equivalent experience) in Engineering, Materials Science, Physics, or a related field; MS or PhD preferred.

  • 6+ years in a hardware validation/reliability environment related to PCIe peripherals, graphics cards, and servers.

  • Understanding of power supplies, memory, high-speed I/O, PCI Express, Ethernet, and I2C.

  • Hands-on experience with theoretical and practical reliability concepts as they relate to high-tech electronic enterprise and consumer products.

  • Strong command of statistical concepts, models, and analysis, and of how they relate to product reliability and life analysis.

  • Good verbal and writing skills as well as the ability to communicate at a high level.

  • Self-motivating, independent, and committed to getting things done.

  • Good project management skills and ability to balance multiple simultaneous projects during development and production stages.

With competitive salaries and a generous benefits package, we are widely considered to be one of the technology world’s most desirable employers. If you're a creative and autonomous engineer with a real passion for technology, we want to hear from you. Come build the future with us!

The base salary range is 140,000 USD - 264,500 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

Similar Jobs

Ubisoft - Senior Programmer [Unity]

Ubisoft

Shanghai, Shanghai, China (On-Site)
7 Months ago
Inworld AI - People Ops/HR Lead

Inworld AI

Mountain View, California, United States (Hybrid)
2 Months ago
Activision - Principal Engine Programmer

Activision

Warsaw, Masovian Voivodeship, Poland (Hybrid)
2 Months ago
The Walt Disney Company - Senior Manager, Data Analytics, DTC ANZ

The Walt Disney Company

Richmond, Victoria, Australia (On-Site)
2 Weeks ago
Flying Wild Hog - Senior Technical Animator

Flying Wild Hog

(Remote)
2 Months ago
ByteDance - Imaging System Architect

ByteDance

San Jose, California, United States (On-Site)
1 Week ago
NVIDIA - Senior Research Engineer for Reinforcement Learning

NVIDIA

Canada (On-Site)
2 Months ago
Tencent - Senior Researcher, Natural Language Processing

Tencent

(On-Site)
2 Months ago
ByteDance - Research Scientist, Infrastructure System Lab

ByteDance

San Jose, California, United States (On-Site)
1 Month ago
ByteDance - Senior Site Reliability Engineer, ML System

ByteDance

Seattle, Washington, United States (On-Site)
6 Months ago

Similar Skill Jobs

Good Job Games - Senior Software Engineer

Good Job Games

İstanbul, Türkiye (On-Site)
5 Months ago
Gym Class VR - 3D Artist - Generalist (Senior / Staff / Principal)

Gym Class VR

(Remote)
1 Day ago
Google - Senior Product Data Scientist, Google Education

Google

Mexico City, Mexico City, Mexico (On-Site)
1 Week ago
Inworld AI - People Ops/HR Lead

Inworld AI

Mountain View, California, United States (Hybrid)
1 Month ago
everi - Developer Software Principal IV (Games)

everi

Reno, Nevada, United States (Hybrid)
5 Months ago
People Can Fly - Senior Sound Designer

People Can Fly

Poland (On-Site)
5 Months ago
Playrix - Senior C++ Software Engineer (Tools)

Playrix

Portugal (Remote)
6 Months ago
Bohemia Interactive - Lead Programmer

Bohemia Interactive

Prague, Prague, Czechia (On-Site)
5 Months ago
Meta - Product Technical Program Manager

Meta

New York, New York, United States (Remote)
5 Months ago
Funkitron - Casual Mobile Free To Play Unity Game Programmer

Funkitron

(Remote)
5 Months ago

Jobs in Santa Clara, California, United States

Framestore - FREELANCE: CG - LOS ANGELES

Framestore

Los Angeles, California, United States (On-Site)
9 Months ago
anavatio - DevOps Engineer

anavatio

Lorton, Virginia, United States (Hybrid)
4 Weeks ago
Aristocrat Gaming - Software Engineer

Aristocrat Gaming

Las Vegas, Nevada, United States (Hybrid)
2 Months ago
Next Level Business Services - Teradata DBA

Next Level Business Services

San Francisco, California, United States (On-Site)
6 Months ago
The Walt Disney Company - Recreation Lifeguard - Part-Time

The Walt Disney Company

Hilton Head Island, South Carolina, United States (On-Site)
2 Weeks ago
Onward Search - UX/UI Designer

Onward Search

Maryland City, Maryland, United States (Remote)
3 Days ago
Coherent Corp - Manufacturing Operator

Coherent Corp

Easton, Pennsylvania, United States (On-Site)
1 Week ago
ByteDance - Research Scientist Graduate (Computational Biology (AI-for-Science))

ByteDance

Seattle, Washington, United States (On-Site)
2 Weeks ago
IGT - Lottery Field Service Technician I-(Austin,TX )

IGT

Texas, United States (On-Site)
4 Months ago
Google - Senior Data Scientist, Research, Storage Analytics

Google

Sunnyvale, California, United States (On-Site)
1 Week ago

Research & Development Jobs

Tencent - Software Engineering Associate 104534

Tencent

Singapore (On-Site)
4 Months ago
Google - Software Engineer, gReach Program for People with Disabilities

Google

Shanghai, Shanghai, China (On-Site)
2 Weeks ago
Google - Senior Staff Software Engineer, Looker Modeling

Google

Kirkland, Washington, United States (On-Site)
1 Week ago
Riot Games - Associate Art Director - League of Legends, Game Modes

Riot Games

Sydney, New South Wales, Australia (On-Site)
10 Months ago
NVIDIA - Senior Software Engineer

NVIDIA

Tel Aviv-Yafo, Tel Aviv District, Israel (On-Site)
2 Months ago
Fluence - Sr. Software Architect (m/f/d)

Fluence

Berlin, Berlin, Germany (On-Site)
6 Months ago
VGW - Customer Research Team Lead

VGW

Perth, Western Australia, Australia (On-Site)
1 Month ago
NVIDIA - Senior High-Performance System Architect

NVIDIA

Tel Aviv-Yafo, Tel Aviv District, Israel (On-Site)
1 Month ago
Assystems - Lead Electrical Engineer (LV/HT/ELV)

Assystems

Gurugram, Haryana, India (On-Site)
6 Months ago
Virtuos - Junior Software Engineer

Virtuos

Dublin, County Dublin, Ireland (Hybrid)
1 Week ago

About The Company

Since its founding in 1993, NVIDIA (NASDAQ: NVDA) has been a pioneer in accelerated computing. The company’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics, ignited the era of modern AI and is fueling the creation of the metaverse. NVIDIA is now a full-stack computing company with data-center-scale offerings that are reshaping industry.
