Senior DevOps Engineer, Deep Learning Frameworks

1 Month ago • 5+ Years • DevOps

Job Summary

Job Description

NVIDIA's Deep Learning Optimized Frameworks Group seeks a Senior DevOps Engineer to enhance its high-performing deep learning software stacks. Responsibilities include automating build, test, integration, and release processes for frameworks like TensorFlow and PyTorch; configuring and maintaining industry-standard DevOps tools (GitLab, Jenkins, Docker, etc.); developing shared utilities; leading best practices; and identifying infrastructure needs. The ideal candidate will possess strong experience in CI/CD, SCM, and build systems, along with programming skills in Python (or similar).
Must have:
  • 5+ years relevant experience
  • CI/CD system automation
  • SCM & build systems expertise (Git, CMake, etc.)
  • Python (or Perl/Shell scripting)
  • Problem-solving & collaboration
Good to have:
  • CUDA & Deep Learning Software Stack experience
  • Container & cluster tech (Kubernetes, Jenkins, etc.)
  • GPU computing systems knowledge
  • Experience with new tech incorporation
Perks:
  • Highly competitive salaries
  • Extensive benefits package
  • Diverse and inclusive work environment

Job Details

NVIDIA's Deep Learning Optimized Frameworks Group is looking for an excellent DevOps Engineer to enable the next wave of NVIDIA’s highest-performing deep learning software stacks. Your role spans multiple products such as TensorFlow and PyTorch and is instrumental in streamlining development, builds, and releases with modern DevOps tools. Join our hardworking team of software engineers and infrastructure experts to design the systems that enable NVIDIA to stay ahead of the competition as we deliver the world's fastest deep learning frameworks.

What you'll be doing:

  • Automating and optimizing build, test, integration, and release processes for optimized NVIDIA Deep Learning Frameworks

  • Configuring, maintaining, and building upon deployments of industry-standard tools (e.g. GitLab, Jenkins, Docker, LXC, Hyper-V, CMake, Bazel)

  • Developing shared utilities for setting up systems, running tests, and recording results

  • Leading best practices for building, testing, and releasing software

  • Identifying infrastructure needs and translating them into action

What we need to see:

  • BS or higher degree in computer science (or equivalent experience)

  • 5+ years of relevant experience

  • Strong experience setting up, maintaining, and automating continuous integration systems

  • Fluency in SCM (e.g. GitHub, GitLab, Git) and build systems (e.g. Make, CMake, Bazel, Docker)

  • Strong programming skills in Python (or Perl, or shell scripting such as bash, tcsh, sh)

  • Pragmatic approach to problem solving and collaboration

  • Real passion for “it just works” automation and enabling team members

Ways to stand out from the crowd:

  • Experience with CUDA and Deep Learning Software Stack

  • Good knowledge of container and cluster technologies like Slurm, Kubernetes, Jenkins, GitLab CI, and Zabbix

  • Experience with GPU computing systems

  • Track record of identifying useful new technologies and incorporating them into SW development flows

  • Experience as an active contributor to a SW project involving many developers

NVIDIA is at the forefront of breakthroughs in Artificial Intelligence, High-Performance Computing, and Visualization. Our teams are composed of driven, innovative professionals dedicated to pushing the boundaries of technology. We offer highly competitive salaries, an extensive benefits package, and a work environment that promotes diversity, inclusion, and flexibility. As an equal opportunity employer, we are committed to fostering a supportive and empowering workplace for all.

Similar Jobs

Stupa Sports Analytics - Computer Vision Engineer

Stupa Sports Analytics

Gurugram, Haryana, India (On-Site)
4 Months ago
Microsoft - Research Intern - Machine Learning

Microsoft

Redmond, Washington, United States (On-Site)
1 Month ago
NVIDIA - Senior Math Libraries Engineer - Dense Linear Algebra

NVIDIA

California, United States (Hybrid)
1 Month ago
Microsoft - Research Intern - Applied Sciences Group (Computer Vision)

Microsoft

Redmond, Washington, United States (On-Site)
1 Month ago
Onward Search - API Developer

Onward Search

North Arlington, New Jersey, United States (Remote)
5 Days ago
Rackspace Technology - AWS Migration Engineer

Rackspace Technology

India (Remote)
5 Days ago
SmileGate - SRE Platform Development Lead

SmileGate

Seongnam-si, Gyeonggi-do, South Korea (On-Site)
1 Month ago
Tencent - Senior Site Reliability Engineer

Tencent

Shanghai, Shanghai, China (On-Site)
5 Months ago
Inworld AI - Staff Platform Engineer - Canada

Inworld AI

Vancouver, British Columbia, Canada (On-Site)
2 Months ago


Similar Skill Jobs

NVIDIA - Solutions Architect, Generative AI

NVIDIA

Santa Clara, California, United States (On-Site)
1 Month ago
ByteDance - Research Scientist Graduate (Foundation Model - Generative AI) - 2025 Start (PhD)

ByteDance

San Jose, California, United States (On-Site)
2 Months ago
Meta - Research Scientist Intern, Language and Multimodal Research for GenAI (PhD)

Meta

New York, New York, United States (On-Site)
3 Months ago
NVIDIA - Manager, Tools and Development

NVIDIA

Pune, Maharashtra, India (On-Site)
1 Month ago
ByteDance - Video Codec Algorithm Modeling Engineer - Multimedia Lab

ByteDance

San Jose, California, United States (On-Site)
3 Months ago
NVIDIA - Applied Research Intern - 2025

NVIDIA

Yerevan, Yerevan, Armenia (On-Site)
1 Week ago
Microsoft - Senior Applied Scientist- Content Service

Microsoft

Beijing, Beijing, China (On-Site)
1 Month ago
NVIDIA - Senior Reliability Engineer

NVIDIA

Santa Clara, California, United States (On-Site)
1 Month ago
Canva - Machine Learning Engineering Manager (m/f/x) - Canva Austria

Canva

Vienna, Vienna, Austria (Remote)
3 Months ago
GT - ML Architect (with Data Experience)

GT

Ukraine (Remote)
2 Weeks ago


Jobs in Warsaw, Masovian Voivodeship, Poland

Playtika - Unity Developer

Playtika

Warsaw, Masovian Voivodeship, Poland (Hybrid)
6 Months ago
Aristocrat Gaming - QA Automation Engineer (C# or Java)

Aristocrat Gaming

Warsaw, Masovian Voivodeship, Poland (Hybrid)
2 Weeks ago
Fool's Theory - Employment & Payroll Specialist

Fool's Theory

Poland (Remote)
4 Weeks ago
Techland - UE5 Senior Game Programmer AI

Techland

Wrocław, Lower Silesian Voivodeship, Poland (On-Site)
2 Months ago
Nielsen Holdings - Director, Compensation

Nielsen Holdings

Warsaw, Masovian Voivodeship, Poland (Hybrid)
2 Months ago
Eleven Labs - FullStack Engineer (Frontend Leaning)

Eleven Labs

Warsaw, Masovian Voivodeship, Poland (Remote)
6 Months ago
Luxoft - Java Senior Software Developer

Luxoft

Kraków, Lesser Poland Voivodeship, Poland (On-Site)
3 Months ago
Netflix - Software Engineer (L4/L5) - Content Engineering

Netflix

Warsaw, Masovian Voivodeship, Poland (Hybrid)
1 Month ago
Nielsen Holdings - Data & Reporting Analyst with German

Nielsen Holdings

Warsaw, Masovian Voivodeship, Poland (Remote)
4 Months ago
Tesla - Service Operations Lead - Poland

Tesla

Ząbki, Masovian Voivodeship, Poland (On-Site)
1 Week ago


DevOps Jobs

Codeninja - Senior Python Developer with DevOps Expertise

Codeninja

Lahore, Punjab, Pakistan (On-Site)
4 Months ago
NVIDIA - GenAI and MLOps Intern - Spring 2025

NVIDIA

Taipei City, Taiwan (On-Site)
3 Weeks ago
Meltwater - Backend & Cloud Engineer – Javascript

Meltwater

Hyderabad, Telangana, India (Hybrid)
4 Months ago
Nintendo - Machine Learning Operations Engineer

Nintendo

Redmond, Washington, United States (On-Site)
1 Week ago
Electronic Arts - [EA Sports FC] DevOps Engineer

Electronic Arts

Seoul, South Korea (On-Site)
3 Months ago
DEVOTEAM - Distributed Cloud | Azure Cloud Architect

DEVOTEAM

Lisbon, Lisbon, Portugal (Remote)
4 Months ago
OpenGov - DevOps Engineer III

OpenGov

Atlanta, Georgia, United States (Hybrid)
4 Months ago
Kwalee - DevOps Engineer

Kwalee

Bengaluru, Karnataka, India (On-Site)
3 Weeks ago
Axon - Manager, Site Reliability Engineering

Axon

Seattle, Washington, United States (Remote)
6 Days ago


About The Company

Since its founding in 1993, NVIDIA (NASDAQ: NVDA) has been a pioneer in accelerated computing. The company’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics, ignited the era of modern AI and is fueling the creation of the metaverse. NVIDIA is now a full-stack computing company with data-center-scale offerings that are reshaping industry.


