Senior Software Engineer, DGX Cloud Orchestration

1 Month ago • 5-9 Years • DevOps • $136,000 PA - $264,500 PA

Job Summary

Job Description

As a Senior Software Engineer in the DGX Cloud Orchestration team, you'll build foundational systems for NVIDIA's high-performance GPU infrastructure. Responsibilities include designing and developing scalable automation solutions (GraphQL/REST APIs), integrating diverse systems, and streamlining infrastructure lifecycle processes using Kubernetes and other tools. You'll collaborate across teams, codify business processes, and optimize cloud operations for reliability and efficiency. The role requires expertise in API development, cloud infrastructure (AWS, GCP, Azure), container technologies, and distributed systems.
Must have:
  • GraphQL/REST API design & development
  • Go, Java, or Python proficiency
  • Cloud infrastructure (AWS, GCP, Azure)
  • Kubernetes & container technologies
  • High-scale distributed systems experience
Good to have:
  • Modern JavaScript frameworks (React, Angular, Next.js)
  • Experience with workflow orchestration systems
  • Proven experience in reducing operational inefficiencies
Perks:
  • Equity
  • Benefits

Job Details

We are looking for a Senior Software Engineer to join our DGX Cloud team and build the foundational systems that drive NVIDIA’s high-performance GPU infrastructure. You will play a critical role in designing scalable automation solutions, integrating diverse systems, and enabling seamless workflows across global cloud operations. NVIDIA is widely recognized as one of the most desirable employers, with some of the most talented people in the world working for us. If you're passionate about building scalable, efficient systems to power cloud operations, we invite you to join our team.

What You'll Be Doing

  • Design and develop APIs (GraphQL/REST) to orchestrate and integrate operational workflows.

  • Build state management and workflow automation systems that streamline infrastructure lifecycle processes.

  • Collaborate across teams to codify business processes into scalable, self-measuring systems.

  • Develop extensible, schema-driven platforms for reducing manual toil and ensuring operational consistency.

  • Drive integrations with container orchestration tools like Kubernetes and observability systems such as Prometheus, OpenTelemetry, Grafana.

  • Optimize the reliability and efficiency of cloud operations through automated workflows and telemetry systems.

  • Lead and ship impactful technical projects, ensuring quality and scalability at every stage.

What We Need to See

  • 5-9+ years of industry experience with a Bachelor’s or Master’s degree (or equivalent experience), or 2+ years with a PhD.

  • Expertise in building GraphQL and REST APIs.

  • Proficiency in programming languages such as Go, Java, or Python.

  • Familiarity with modern JavaScript frameworks (e.g., React, Angular, Next.js).

  • Strong understanding of cloud infrastructure (AWS, GCP, Azure) and container technologies like Docker and Kubernetes.

  • Experience with high-scale distributed systems, including architectural patterns for APIs and data pipelines.

  • Outstanding communication and collaboration skills, with a focus on solving complex operational challenges.

  • A passion for automating manual processes and driving system efficiency.

Ways to Stand Out from the Crowd

  • A track record of designing workflow orchestration systems for large-scale infrastructure.

  • Proven experience in reducing operational inefficiencies through automation and integration.

  • Strong debugging and problem-solving skills in distributed environments.

NVIDIA is committed to creating an environment where diverse perspectives drive innovation. As part of the DGX Cloud team, you’ll work on groundbreaking technology that powers the future of AI and cloud computing. NVIDIA is leading the way in Artificial Intelligence, High-Performance Computing, and Visualization. Our invention, the GPU, serves as the visual cortex of modern computers and is at the heart of our products and services. Our work opens up new universes to explore, enables amazing creativity and discovery, and powers what were once science fiction inventions, from artificial intelligence to autonomous cars. NVIDIA is looking for great people like you to help us accelerate the next wave of artificial intelligence.

The base salary range is 136,000 USD - 264,500 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

About The Company

Since its founding in 1993, NVIDIA (NASDAQ: NVDA) has been a pioneer in accelerated computing. The company’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics, ignited the era of modern AI and is fueling the creation of the metaverse. NVIDIA is now a full-stack computing company with data-center-scale offerings that are reshaping industry.
