Senior System Software Engineer - MLOps

2 Days ago • 3 Years + • DevOps • $148,000 PA - $287,500 PA

Job Summary

Job Description

NVIDIA seeks a Senior System Software Engineer to contribute to the Triton Inference Server. Responsibilities include building infrastructure solutions, defining CI/CD processes, ensuring cross-platform compatibility, collaborating with cross-functional teams, and optimizing the deployment pipeline. The ideal candidate will possess strong software design skills (Bash, Python, CI/CD), experience with GitHub and cloud platforms, distributed systems programming, and a deep understanding of deep learning frameworks. Experience with Docker and Kubernetes is a plus.
Must have:
  • 3+ years experience in relevant field
  • Excellent Bash, Python, CI/CD skills
  • Experience with GitHub and cloud platforms
  • Knowledge of distributed systems
  • Software design and debugging skills
Good to have:
  • Experience with Docker and Kubernetes
  • Experience designing/architecting systems
  • Contributions to open-source deep learning community
  • Experience with infrastructure as code
Perks:
  • Equity
  • Benefits

Job Details

We are now looking for a Senior System Software Engineer to work on Triton Inference Server! NVIDIA is hiring software engineers for its GPU-accelerated deep learning software team. Academic and commercial groups around the world are using GPUs to power a revolution in deep learning, enabling breakthroughs in problems from image classification to speech recognition to natural language processing. We are a fast-paced team building tools and software to make the design and deployment of new deep learning models easier and more accessible to data scientists.

What you'll be doing:

In this role, you will build, from first principles, the infrastructure solutions needed to deliver Triton Inference Server. You will apply software design skills to define the processes and best practices for continuous integration, testing, and releasing builds, while ensuring the cross-platform compatibility of Triton Inference Server across a wide range of operating systems and hardware architectures. Using your expertise, you will influence how we design our customer-facing technology and tools to enable an optimized pipeline for building and deploying our product. Extensive collaboration with cross-functional teams to integrate pipelines from deep learning frameworks and components is essential to ensuring seamless deployment and inference of deep learning models across Triton Inference Server.

What we need to see:

  • Master's degree or equivalent experience

  • 3+ years of experience in Computer Science, computer architecture, or related field

  • Ability to work in a fast-paced, agile team environment

  • Excellent Bash, CI/CD, Python programming and software design skills, including debugging, performance analysis, and test design. 

  • Experience in administering, monitoring, and deploying systems and services on GitHub and cloud platforms, including supporting other technical teams in monitoring the operating efficiency of the platform and responding as needs arise.

  • Knowledge of distributed systems programming.

Ways to stand out from the crowd:

  • Experience designing or architecting new and existing systems (design patterns, reliability, and scaling).

  • Experience driving efficiencies in software architecture, creating metrics, implementing infrastructure as code and other automation improvements.

  • Background deploying cloud-native services using modern technologies such as Docker and Kubernetes, and optimizing software for scalable and efficient deployment in cloud environments.

  • Experience contributing to a large open-source deep learning community: use of GitHub, bug tracking, branching and merging code, OSS licensing issues, handling patches, etc.

  • Excellent problem-solving abilities spanning multiple software layers (storage systems, kernels, and containers), as well as experience collaborating within an agile team environment to prioritize deep learning-specific features and capabilities within Triton Inference Server and employing advanced troubleshooting and debugging techniques to resolve complex technical issues.

NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most experienced and hard-working people in the world working for us. Are you creative and autonomous? Do you love a challenge? If so, we want to hear from you. Come help us build the real-time, efficient computing platform driving our success in the dynamic and quickly growing field of Deep Learning and Artificial Intelligence.

#LI-Hybrid 

The base salary range is 148,000 USD - 287,500 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

About The Company

Since its founding in 1993, NVIDIA (NASDAQ: NVDA) has been a pioneer in accelerated computing. The company’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics, ignited the era of modern AI and is fueling the creation of the metaverse. NVIDIA is now a full-stack computing company with data-center-scale offerings that are reshaping industry.

