Software Platform Support Engineer - GPU Cloud

6 Days ago • 2 Years + • DevOps • $76,000 PA - $172,500 PA

Job Description

NVIDIA's DGX Cloud team seeks passionate Software Platform Support Engineers to provide Tier 1 support for complex cloud platforms. Responsibilities include partnering with internal teams, troubleshooting issues, creating documentation (knowledge base articles, how-to guides), building support tooling, understanding user workloads, collaborating with engineering on solutions, and participating in on-call rotations. The role requires expertise in cloud deployments, Linux, Kubernetes, and data storage technologies, alongside strong troubleshooting and communication skills. Experience with SLURM, HPC, and machine learning, along with a customer-centric approach, is highly valued.
Must have:
  • 2+ years supporting distributed software systems
  • 2+ years supporting end-user software platforms
  • Linux experience
  • Kubernetes expertise
  • Cloud platform (AWS, Azure, OCI, GCP) knowledge
  • Data storage technology understanding
  • Troubleshooting and communication skills
Good to have:
  • SLURM or HPC experience
  • Machine Learning/AI experience
  • Strong organizational skills
Perks:
  • Equity
  • Benefits

Job Details

The NVIDIA DGX Cloud organization is looking for passionate software support engineers to partner closely with our internal customers and support them on our internal platforms. This partnership requires you to gain a deep understanding of customer needs and how their applications work, assist them in troubleshooting issues, and create documentation that makes it easier for users to troubleshoot issues themselves. The support you provide will help our users have a better experience and help shape our platform.

We expect you to have knowledge of supporting cloud-based deployments across compute, storage and networking environments. 

 

What you will be doing:

  • Partner with multiple internal teams to provide Tier 1 support for complex cloud platforms

  • Triage customer issues, investigate root causes, and escalate as needed

  • File bugs and report issues while working closely with the Site Reliability team

  • Build tooling to improve the customer support process and visibility

  • Document best practices, solutions, knowledge base articles, how-tos, and blog posts

  • Deeply understand user workloads and use cases 

  • Partner with multiple internal teams to give feedback to engineering teams and develop solutions to aid in their success

  • Be part of an on-call rotation to support production systems

What we need to see:

  • BS/MS degree in Computer Science or a related area (or equivalent experience)

  • 2+ years of experience supporting distributed software systems

  • 2+ years of experience supporting end-user software platforms

  • 2+ years of experience with Linux

  • Experience with Kubernetes and with AWS, Azure, OCI, and GCP

  • Background in infrastructure, networking, storage, and DevOps scripting/tooling

  • Understanding of data storage technologies (databases, file, block, blob)

  • Willingness to become an expert in DGX Cloud

  • Customer service/support experience

  • Willingness to work up and down the stack as well as across multiple teams 

  • Strong troubleshooting skills and outstanding communication skills

Ways to stand out from the crowd:

  • Previous SLURM or HPC experience

  • Machine Learning and/or AI experience (self-taught is great!)

  • A strong drive to work with internal customers and make them successful

  • A drive to improve processes, backed by strong organizational skills

NVIDIA is leading the way in groundbreaking developments in Artificial Intelligence, High-Performance Computing and Visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services. Our work opens up new universes to explore, enables amazing creativity and discovery, and powers what were once science fiction inventions from artificial intelligence to autonomous cars. NVIDIA is looking for phenomenal people like you to help us accelerate the next wave of artificial intelligence.

The base salary range is 76,000 USD - 172,500 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

About The Company

Since its founding in 1993, NVIDIA (NASDAQ: NVDA) has been a pioneer in accelerated computing. The company’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics, ignited the era of modern AI and is fueling the creation of the metaverse. NVIDIA is now a full-stack computing company with data-center-scale offerings that are reshaping industry.

