DGX Cloud Infrastructure Engineering Intern - Fall 2025

1 Month ago • Up to 1 Year • Research & Development

Job Summary

Job Description

NVIDIA is seeking a DGX Cloud Infrastructure Engineering Intern for Fall 2025 to contribute to scaling its AI infrastructure. Responsibilities include designing and architecting a platform for automated GPU asset management across cloud providers; developing, testing, and optimizing solutions for datacenter firmware; collaborating with hardware, software, and business teams; defining server-level reliability requirements; driving failure analysis; and ensuring seamless software integration across the entire stack. The ideal candidate possesses a strong programming background, understands distributed systems, and has experience with software testing, deployment, and GPU computing. Strong communication and problem-solving skills are essential.
Must have:
  • Strong programming (C, C++, Python)
  • Distributed systems understanding
  • Software testing & deployment experience
  • GPU computing (CUDA, OpenCL)
  • Deep Learning Frameworks (PyTorch, TensorFlow)
Good to have:
  • Perl
  • OpenACC
  • Caffe
  • HPC (MPI, OpenMP)
  • Performance Modeling & Optimization
Perks:
  • Intern benefits

Job Details

NVIDIA is hiring engineers to scale up its AI infrastructure. We expect you to have a strong programming background, a deep understanding of distributed systems, familiarity with software testing and deployment, and excellent communication and planning abilities. We also welcome out-of-the-box thinkers who can provide new ideas with a strong bias toward execution. Expect to be constantly challenged, improving, and evolving for the better. You and other engineers on this team will help advance NVIDIA's capacity to build and deploy leading infrastructure solutions for a broad range of AI-based applications that affect core data science. If you're creative, passionate about what you do, and love having fun, what are you waiting for? Apply today!


What you’ll be doing:

  • Design and architect a comprehensive platform that automates GPU asset provisioning, configuration, and lifecycle management across cloud providers.
  • Design, develop, test, debug, and optimize creative solutions for datacenter firmware throughout its lifecycle.
  • Work closely with hardware, software, infrastructure, and business teams to transform new firmware features from idea to reality.
  • Define server-level reliability, availability, and serviceability requirements in collaboration with customers such as CSPs, and deliver fault-resilient solutions at scale that meet customer expectations.
  • Collaborate with hardware, software, and firmware teams to drive failure analysis and large-scale solution deployment.
  • Work with engineering teams across NVIDIA to ensure your software integrates seamlessly from the hardware all the way up to the AI training applications.

What we need to see:

  • Currently pursuing a Bachelor's, Master's, or PhD degree in Computer Engineering, Electrical Engineering, Computer Science, or a related field.
  • Course or internship experience in the following areas is required: Computer Architecture, Deep Learning or Machine Learning, GPU Computing and Parallel Programming, and Performance Modeling, Profiling, Optimization, and/or Analysis.
  • Prior experience with or knowledge of the following programming skills and technologies is required: C, C++, Python, Perl, GPU Computing (CUDA, OpenCL, OpenACC), Deep Learning Frameworks (PyTorch, TensorFlow, Caffe), and HPC (MPI, OpenMP).

The hourly rate for our interns is 18 USD - 71 USD. Our internship hourly rates are standard pay determined by the position and your location, year in school, degree, and experience.

You will also be eligible for Intern benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

About The Company

Since its founding in 1993, NVIDIA (NASDAQ: NVDA) has been a pioneer in accelerated computing. The company’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics, ignited the era of modern AI and is fueling the creation of the metaverse. NVIDIA is now a full-stack computing company with data-center-scale offerings that are reshaping industry.
