Senior Site Reliability Engineer - Infrastructure

1 Month ago • 5 Years + • DevOps • $148,000 PA - $287,500 PA

Job Summary

Job Description

As a Senior Site Reliability Engineer at NVIDIA, you will collaborate with various teams to enhance the infrastructure supporting the development of innovative chips. Responsibilities include developing automation for scalable infrastructure, implementing infrastructure innovations using broad IT skills (network architecture, storage, virtualization), working closely with EDA teams to translate requirements into solutions, and performing fast-paced investigations to accelerate chip development. You will contribute to improving the overall quality and time to market for next-generation chips.
Must have:
  • Automation (Ansible, Jenkins)
  • UNIX Systems Programming
  • Python experience
  • UNIX CLI expertise (sed, awk, grep)
  • Storage, Networking, Compute architecture
  • Distributed UNIX systems knowledge (NFS, autofs, DNS, LDAP)
  • Problem-solving and debugging skills
Good to have:
  • IBM Spectrum LSF/SLURM experience
  • Perl experience
  • Distributed system expertise
  • Chip design workflow knowledge
  • Security/productivity solution design
Perks:
  • Equity
  • Benefits

Job Details

NVIDIA has continuously reinvented itself over two decades. Our invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI — the next era of computing. NVIDIA is a “learning machine” that constantly evolves by seeking new opportunities that are hard to solve, that only we can address, and that matter to the world. This is our life’s work, to amplify human creativity and intelligence. Make the choice to join us today!


As an SRE, or an engineer with equivalent experience, you'll collaborate with various teams to improve our infrastructure environment within NVIDIA's Hardware Infrastructure team. You will enable our engineers to have the best environment on the planet to make the most innovative chips in the world. You will work with your team of EDA and software experts to build new infrastructure in an agile environment. You will continuously innovate and improve scalable, reliable, high-performance systems and tools to enable the next generation of chips!


What you’ll be doing:

  • Develop automation to scale infrastructure easily and reliably.

  • Use broad IT infrastructure skills to implement infrastructure innovations which accelerate chip development.

  • Design and implement network architecture, storage solutions, virtualization, and services specific to EDA workflows.

  • Work closely with EDA teams to understand their requirements and translate them into infrastructure solutions.

  • Work in a diverse team performing fast-paced investigations to empower engineers to develop at the speed of light.

  • Collaborate to improve how our chip development process utilizes our infrastructure.

  • Directly contribute to the overall quality of our next-generation chips and improve their time to market.


What we need to see:

  • Experience with automation workflows such as Ansible and Jenkins.

  • UNIX systems programming and automation using industry-standard languages, and familiarity with API calls. Python experience preferred.

  • Authoritative command of UNIX and UNIX CLI utilities such as sed, awk, and grep.

  • Hands-on experience making architectural decisions about the technologies (storage, networking, compute) our chip engineers depend on.

  • Understanding of distributed UNIX system concepts such as NFS, autofs, DNS, LDAP and/or NIS.

  • Excellent planning and communication skills and a passion for improving the productivity and efficiency of other specialists.

  • Strong experience investigating and debugging complex, multi-discipline problems in a UNIX environment.

  • 5+ years of experience in a large, distributed UNIX environment.

  • History of using data analysis principles and influencing data-driven decisions.

  • MS (preferred) or BS in Computer Science or a similar degree, or equivalent experience.


Ways to stand out from the crowd:

  • Extensive knowledge of job schedulers (in particular IBM Spectrum LSF and/or SLURM).

  • Experience with Perl.

  • Deep understanding of distributed system principles.

  • Experience with chip design workflows, such as front end verification, back end workflows, or mixed signal workflows.

  • Experience in crafting solutions that balance security and productivity for the end user.

The base salary range is 148,000 USD - 287,500 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

About The Company

Since its founding in 1993, NVIDIA (NASDAQ: NVDA) has been a pioneer in accelerated computing. The company’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics, ignited the era of modern AI and is fueling the creation of the metaverse. NVIDIA is now a full-stack computing company with data-center-scale offerings that are reshaping industry.
