Senior Software Engineer, DGX Cloud Orchestration

1 Month ago • 5-9 Years • DevOps • $136,000 PA - $264,500 PA

Job Summary

Job Description

NVIDIA seeks a Senior Software Engineer to join its DGX Cloud team. This role involves designing and developing scalable automation solutions for high-performance GPU infrastructure, integrating diverse systems, and creating seamless workflows for global cloud operations. Responsibilities include designing APIs (GraphQL/REST), building state management systems, collaborating across teams to codify business processes, developing extensible platforms, integrating with Kubernetes and observability systems, optimizing cloud operations, and leading impactful technical projects. The ideal candidate possesses expertise in building APIs, proficiency in Go, Java, or Python, familiarity with cloud infrastructure (AWS, GCP, Azure), and experience with high-scale distributed systems.
Must have:
  • GraphQL/REST API design & development
  • Go/Java/Python proficiency
  • Cloud infrastructure & Kubernetes expertise
  • High-scale distributed systems experience
  • Workflow orchestration system design
Good to have:
  • Experience reducing operational inefficiencies
  • Strong debugging and problem-solving skills
Perks:
  • Equity
  • Benefits

Job Details

We are looking for a Senior Software Engineer to join our DGX Cloud team and build the foundational systems that drive NVIDIA’s high-performance GPU infrastructure. You will play a critical role in designing scalable automation solutions, integrating diverse systems, and enabling seamless workflows across global cloud operations. NVIDIA is widely recognized as one of the most desirable employers, with some of the most talented people in the world working for us. If you're passionate about building scalable, efficient systems to power cloud operations, we invite you to join our team.

What You'll Be Doing

  • Design and develop APIs (GraphQL/REST) to orchestrate and integrate operational workflows.

  • Build state management and workflow automation systems that streamline infrastructure lifecycle processes.

  • Collaborate across teams to codify business processes into scalable, self-measuring systems.

  • Develop extensible, schema-driven platforms for reducing manual toil and ensuring operational consistency.

  • Drive integrations with container orchestration tools like Kubernetes and observability systems such as Prometheus, OpenTelemetry, and Grafana.

  • Optimize the reliability and efficiency of cloud operations through automated workflows and telemetry systems.

  • Lead and ship impactful technical projects, ensuring quality and scalability at every stage.

What we need to see:

  • 5-9+ years of industry experience with a Bachelor’s or Master’s degree (or equivalent experience), or 2+ years with a PhD.

  • Expertise in building GraphQL and REST APIs.

  • Proficiency in programming languages such as Go, Java, or Python.

  • Familiarity with modern JavaScript frameworks (e.g., React, Angular, Next.js).

  • Strong understanding of cloud infrastructure (AWS, GCP, Azure) and container technologies like Docker and Kubernetes.

  • Experience with high-scale distributed systems, including architectural patterns for APIs and data pipelines.

  • Outstanding communication and collaboration skills, with a focus on solving complex operational challenges.

  • A passion for automating manual processes and driving system efficiency.

Ways to Stand Out from the Crowd

  • A track record of designing workflow orchestration systems for large-scale infrastructure.

  • Proven experience in reducing operational inefficiencies through automation and integration.

  • Strong debugging and problem-solving skills in distributed environments.

NVIDIA is committed to creating an environment where diverse perspectives drive innovation. As part of the DGX Cloud team, you’ll work on groundbreaking technology that powers the future of AI and cloud computing. NVIDIA is leading the way in Artificial Intelligence, High-Performance Computing, and Visualization. Our invention serves as the visual cortex of modern computers and is at the heart of our products and services. Our work opens up new universes to explore, enables amazing creativity and discovery, and powers what were once science fiction inventions, from artificial intelligence to autonomous cars. NVIDIA is looking for great people like you to help us accelerate the next wave of artificial intelligence.

The base salary range is 136,000 USD - 264,500 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

About The Company

Since its founding in 1993, NVIDIA (NASDAQ: NVDA) has been a pioneer in accelerated computing. The company’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics, ignited the era of modern AI and is fueling the creation of the metaverse. NVIDIA is now a full-stack computing company with data-center-scale offerings that are reshaping industry.
