Senior System Reliability Engineer

5 Days ago • 6-8 Years • Research & Development • $140,000 PA - $264,500 PA

Job Summary

Job Description

NVIDIA seeks a Senior System Reliability Engineer to contribute to the reliability of its GPU servers and high-performance computing systems. Responsibilities include establishing and maintaining product reliability standards, participating in design reviews, working with suppliers and partners, defining reliability plans, performing testing and failure analysis, and correlating test results with field performance. This role requires expertise in hardware reliability engineering for electronics and server systems, including graphics cards, servers, racks, and clusters, across the entire product lifecycle. The ideal candidate will have extensive experience with PCIe peripherals, graphics cards, and servers, strong statistical analysis skills, and excellent communication abilities.
Must have:
  • Hardware Reliability Engineering Expertise
  • Experience with PCIe peripherals, graphics cards, servers
  • Strong statistical analysis skills
  • Excellent communication skills
  • Design for Reliability (DfR) methods
  • Failure analysis and recommendations
Good to have:
  • MS or PhD in relevant field
Perks:
  • Competitive salary
  • Generous benefits package

Job Details

NVIDIA has continuously reinvented itself over two decades. Our invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing, with the GPU acting as the brains of computers, robots, and self-driving cars that can perceive and understand the world. Today, we are increasingly known as “the AI computing company.” We're looking to grow our company and build our teams with the most thoughtful people in the world. Join us at the forefront of technological advancement.

GPU servers are one of the fastest-growing segments for NVIDIA and the artificial intelligence industry. As computational power increases with every GPU generation, developing efficient and reliable systems is imperative. We are looking for a System Reliability Engineer to join NVIDIA's existing Reliability Engineering team, which is involved in NVIDIA's diverse system product range, specifically graphics and high-performance computing printed circuit boards and data center servers.


What you'll be doing:

  • Provide expertise in Hardware Reliability Engineering for Electronics/Server Systems (graphics cards, server, rack, cluster) from Concept to End-of-Life phase.

  • Establish, deliver, and maintain product reliability standards and metrics for NVIDIA's new system technologies, using existing tools and processes or developing new ones as required.

  • Participate in product and engineering design reviews, assess the reliability budget of products/designs, and inspire changes that enhance product reliability.

  • Interface and interact with all pertinent engineering groups, suppliers, and partners, ensuring the desired reliability is achieved using Design for Reliability (DfR) methods, including FMEA and DoE approaches.

  • Define and implement Reliability Plans & Specifications.

  • Provide reliability predictions, along with test plans and methods to assess and drive product reliability to the desired levels.

  • Perform and lead appropriate testing with associated failure analysis and recommendations for improving designs and manufacturing.

  • Develop and present methods of correlating reliability test results with actual field performance.


What we need to see:

  • BS (or equivalent experience) in Engineering, Material Science, Physics, or a related field; MS or PhD preferred.

  • 6+ years in a hardware validation/reliability environment related to PCIe peripherals, graphics cards, and servers.

  • Understanding of power supply, memory, high-speed I/O, PCI Express, Ethernet, and I2C.

  • Hands-on experience with theoretical and practical reliability concepts as they relate to high-tech electronic enterprise and consumer products.

  • Strong command and understanding of statistical concepts, models, and analysis, and how they relate to product reliability and life analysis.

  • Good verbal and written communication skills, as well as the ability to communicate at a high level.

  • Self-motivating, independent, and committed to getting things done.

  • Good project management skills and ability to balance multiple simultaneous projects during development and production stages.

With competitive salaries and a generous benefits package, we are widely considered to be one of the technology world’s most desirable employers. If you're a creative and autonomous engineer with a real passion for technology, we want to hear from you. Come build the future with us!

The base salary range is 140,000 USD - 264,500 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

About The Company

Since its founding in 1993, NVIDIA (NASDAQ: NVDA) has been a pioneer in accelerated computing. The company’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics, ignited the era of modern AI and is fueling the creation of the metaverse. NVIDIA is now a full-stack computing company with data-center-scale offerings that are reshaping industry.

