Senior Site Reliability Engineer - Infrastructure

2 Months ago • 5 Years + • DevOps • $148,000 PA - $287,500 PA

Job Summary

Job Description

As a Senior Site Reliability Engineer at NVIDIA, you will collaborate with various teams to enhance the infrastructure environment supporting the development of innovative chips. Responsibilities include developing automation for scalable infrastructure, implementing infrastructure innovations, designing network architecture and storage solutions, working closely with EDA teams, and investigating complex problems. You will contribute to improving the chip development process, enhancing quality, and reducing time to market. The role requires expertise in automation workflows, UNIX systems, and distributed systems concepts. Experience with technologies like Ansible, Jenkins, Python, and various UNIX utilities is essential.
Must have:
  • Automation (Ansible, Jenkins)
  • UNIX Systems Programming
  • Python experience
  • Networking & Storage Expertise
  • Distributed Systems Knowledge
  • Problem Solving & Debugging
Good to have:
  • IBM Spectrum LSF/SLURM
  • Perl
  • Chip Design Workflow Knowledge
  • Security & Productivity Solutions
Perks:
  • Equity
  • Benefits

Job Details

NVIDIA has continuously reinvented itself over two decades. Our invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI — the next era of computing. NVIDIA is a “learning machine” that constantly evolves by seeking new opportunities that are hard to solve, that only we can address, and that matter to the world. This is our life’s work, to amplify human creativity and intelligence. Make the choice to join us today!


As an SRE, or an engineer with equivalent experience, you'll collaborate with various teams to improve our infrastructure environment within NVIDIA's Hardware Infrastructure team. You will enable our engineers to have the best environment on the planet to make the most innovative chips in the world. You will work with your team of EDA and software experts to build new infrastructure in an agile environment. You will continuously innovate and improve scalable, reliable, high-performance systems and tools to enable the next generation of chips!


What you’ll be doing:

  • Develop automation in order to scale infrastructure easily and reliably.

  • Use broad IT infrastructure skills to implement infrastructure innovations which accelerate chip development.

  • Design and implement network architecture, storage solutions, virtualization, and services specific to EDA workflows.

  • Work closely with EDA teams to understand their requirements and translate them into infrastructure solutions.

  • Work in a diverse team performing fast-paced investigations to empower engineers to develop at the speed of light.

  • Collaborate to improve how our chip development process utilizes our infrastructure.

  • Directly contribute to overall quality and improve time to market for our next-generation chips.


What we need to see:

  • Experience with automation workflows such as Ansible and Jenkins.

  • Experience with UNIX systems programming and automation using industry-standard languages, and familiarity with API calls. Python experience preferred.

  • Expert-level command of UNIX and UNIX CLI utilities such as sed, awk, and grep.

  • Hands-on experience making architectural decisions about the technologies (storage, networking, compute) our chip engineers depend on.

  • Understanding of distributed UNIX system concepts such as NFS, autofs, DNS, LDAP and/or NIS.

  • Excellent planning and communication skills and a passion for improving the productivity and efficiency of other specialists.

  • Strong experience investigating and debugging complex, multi-discipline problems in a UNIX environment.

  • 5+ years of experience in a large, distributed UNIX environment.

  • History of using data analysis principles and influencing data-driven decisions.

  • MS (preferred) or BS in Computer Science or a similar field, or equivalent experience.


Ways to stand out from the crowd:

  • Extensive knowledge of job schedulers (in particular IBM Spectrum LSF and/or SLURM).

  • Experience with Perl.

  • Deep understanding of distributed system principles.

  • Experience with chip design workflows, such as front end verification, back end workflows, or mixed signal workflows.

  • Experience in crafting solutions that balance security and productivity for the end user.

The base salary range is 148,000 USD - 287,500 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.





About The Company

Since its founding in 1993, NVIDIA (NASDAQ: NVDA) has been a pioneer in accelerated computing. The company’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics, ignited the era of modern AI and is fueling the creation of the metaverse. NVIDIA is now a full-stack computing company with data-center-scale offerings that are reshaping industry.

