Senior DevOps Engineer, Deep Learning Frameworks

1 Month ago • 5 Years + • DevOps

Job Summary

Job Description

NVIDIA's Deep Learning Optimized Frameworks Group seeks a Senior DevOps Engineer to enhance its high-performing deep learning software stacks. Responsibilities include automating build, test, integration, and release processes for frameworks like TensorFlow and PyTorch; configuring and maintaining industry-standard DevOps tools (GitLab, Jenkins, Docker, etc.); developing shared utilities; leading best practices; and identifying infrastructure needs. The ideal candidate will have strong experience in CI/CD, SCM, and build systems, along with programming skills in Python (or similar).
Must have:
  • 5+ years relevant experience
  • CI/CD system automation
  • SCM & build systems expertise (Git, CMake, etc.)
  • Python (or Perl/Shell scripting)
  • Problem-solving & collaboration
Good to have:
  • CUDA & Deep Learning Software Stack experience
  • Container & cluster tech (Kubernetes, Jenkins, etc.)
  • GPU computing systems knowledge
  • Experience incorporating new technologies
Perks:
  • Highly competitive salaries
  • Extensive benefits package
  • Diverse and inclusive work environment

Job Details

NVIDIA's Deep Learning Optimized Frameworks Group is looking for an excellent DevOps Engineer to enable the next wave of NVIDIA's highest-performing deep learning software stacks. Your role spans multiple products, such as TensorFlow and PyTorch, and is instrumental in streamlining development, builds, and releases with modern DevOps tools. Join our hardworking team of software engineers and infrastructure experts to design the systems that enable NVIDIA to stay ahead of the competition as we deliver the world's fastest deep learning frameworks.

What you'll be doing:

  • Automating and optimizing build, test, integration, and release processes for optimized NVIDIA Deep Learning Frameworks

  • Configuring, maintaining, and building upon deployments of industry-standard tools (e.g. GitLab, Jenkins, Docker, LXC, Hyper-V, CMake, Bazel)

  • Developing shared utilities for setting up systems, running tests, and recording results

  • Leading best practices for building, testing, and releasing software

  • Identifying infrastructure needs and translating them into action

What we need to see:

  • BS or higher degree in computer science (or equivalent experience)

  • 5+ years of relevant experience

  • Strong experience setting up, maintaining, and automating continuous integration systems

  • Fluency in SCM (e.g. GitHub, GitLab, Git) and build systems (e.g. Make, CMake, Bazel, Docker)

  • Strong programming skills in Python (or Perl, or shell scripting such as bash, tcsh, sh)

  • Pragmatic approach to problem solving and collaboration

  • Real passion for “it just works” automation and enabling team members

Ways to stand out from the crowd:

  • Experience with CUDA and Deep Learning Software Stack

  • Good knowledge of container and cluster technologies like Slurm, Kubernetes, Jenkins, GitLab CI, and Zabbix

  • Experience with GPU computing systems

  • Track record of identifying useful new technologies and incorporating them into software development flows

  • Experience as an active contributor to a software project involving many developers

NVIDIA is at the forefront of breakthroughs in Artificial Intelligence, High-Performance Computing, and Visualization. Our teams are composed of driven, innovative professionals dedicated to pushing the boundaries of technology. We offer highly competitive salaries, an extensive benefits package, and a work environment that promotes diversity, inclusion, and flexibility. As an equal opportunity employer, we are committed to fostering a supportive and empowering workplace for all.

About The Company

Since its founding in 1993, NVIDIA (NASDAQ: NVDA) has been a pioneer in accelerated computing. The company’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics, ignited the era of modern AI and is fueling the creation of the metaverse. NVIDIA is now a full-stack computing company with data-center-scale offerings that are reshaping industry.