Senior System Reliability Engineer

2 Months ago • 6-8 Years • Devops • $140,000 PA - $264,500 PA

Job Summary

Job Description

NVIDIA seeks a Senior System Reliability Engineer to contribute to the reliability of its GPU servers and high-performance computing systems. Responsibilities include establishing and maintaining product reliability standards, participating in design reviews, working with suppliers and partners, defining reliability plans, performing testing and failure analysis, and correlating test results with field performance. This role requires expertise in hardware reliability engineering for electronics and server systems, including graphics cards, servers, racks, and clusters, across the entire product lifecycle. The ideal candidate will have extensive experience with PCIe peripherals, graphics cards, and servers, strong statistical analysis skills, and excellent communication abilities.
Must have:
  • Hardware Reliability Engineering Expertise
  • Experience with PCIe peripherals, graphics cards, and servers
  • Strong statistical analysis skills
  • Excellent communication skills
  • Design for Reliability (DfR) methods
  • Failure analysis and recommendations
Good to have:
  • MS or PhD in relevant field
Perks:
  • Competitive salary
  • Generous benefits package

Job Details

NVIDIA has continuously reinvented itself over two decades. Our invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing, with the GPU acting as the brains of computers, robots, and self-driving cars that can perceive and understand the world. Today, we are increasingly known as “the AI computing company.” We're looking to grow our company and build our teams with the most thoughtful people in the world. Join us at the forefront of technological advancement.

GPU servers are one of the fastest-growing segments for NVIDIA and the artificial intelligence industry. As computational power increases with every GPU generation, developing efficient and reliable systems is imperative. We are looking for a System Reliability Engineer to join NVIDIA's existing Reliability Engineering team, which supports NVIDIA's diverse system product range, specifically graphics and high-performance computing printed circuit boards and data center servers.


What you'll be doing:

  • Provide expertise in Hardware Reliability Engineering for Electronics/Server Systems (graphics cards, servers, racks, clusters) from the Concept phase through End-of-Life.

  • Establish, deliver, and maintain product reliability standards and metrics for NVIDIA's new system technologies, using existing tools and processes or developing new ones as required.

  • Participate in product and engineering design reviews, assess the reliability budget of products/designs, and inspire changes that enhance product reliability.

  • Interface with all pertinent engineering groups, suppliers, and partners to ensure the desired reliability is achieved using Design for Reliability (DfR) methods, including FMEA and DoE approaches.

  • Define and implement Reliability Plans & Specifications.

  • Provide reliability predictions, along with test plans and methods to assess and drive product reliability to the desired levels.

  • Perform and lead appropriate testing with associated failure analysis and recommendations for improving designs and manufacturing.

  • Develop and present methods of correlating reliability test results with actual field performance.


What we need to see:

  • BS (or equivalent experience) in Engineering, Material Science, Physics, or a related field; MS or PhD preferred.

  • 6+ years in a hardware validation/reliability environment related to PCIe peripherals, graphics cards, and servers.

  • Understanding of power supplies, memory, high-speed I/O, PCI Express, Ethernet, and I2C.

  • Hands-on experience with theoretical and practical reliability concepts as they relate to high-tech electronic enterprise and consumer products.

  • Strong command of statistical concepts, models, and analysis and how they relate to product reliability and life analysis.

  • Good verbal and written skills as well as the ability to communicate at a high level.

  • Self-motivated, independent, and committed to getting things done.

  • Good project management skills and ability to balance multiple simultaneous projects during development and production stages.

With competitive salaries and a generous benefits package, we are widely considered to be one of the technology world’s most desirable employers. If you're a creative and autonomous engineer with a real passion for technology, we want to hear from you. Come build the future with us!

The base salary range is 140,000 USD - 264,500 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.





About The Company

Since its founding in 1993, NVIDIA (NASDAQ: NVDA) has been a pioneer in accelerated computing. The company’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics, ignited the era of modern AI and is fueling the creation of the metaverse. NVIDIA is now a full-stack computing company with data-center-scale offerings that are reshaping industry.

Santa Clara, California, United States (On-Site)
