Senior Technical Program Manager - Infrastructure Capacity Management

1 Month ago • 10 Years + • Operations • Product Management • $192,000 PA - $304,750 PA

Job Summary

Job Description

NVIDIA's Hardware Infrastructure team seeks a Senior Technical Program Manager to lead the strategy and execution of capacity forecasting, planning, allocation, and management across internal clusters. This role involves working with multiple internal customer teams to build demand models for compute, storage, and network resources. The successful candidate will shape technical strategy, foster continuous improvement, and lead engineering efforts using agile methodologies. They will define strategies to increase resource efficiency, utilize data-driven approaches (metrics, OKRs, KPIs), and create effective communication channels. The role also requires strong cross-functional collaboration with developers, customers, and partners.
Must have:
  • 10+ years of experience in software engineering/TPM
  • Experience forecasting and managing infrastructure resources
  • Experience leading programs across multiple teams and engineers (100+)
  • Experience managing large-scale HPC/AI infrastructure deployments
  • Exceptional communication and presentation skills
  • Agile methodologies & project management tools expertise
Good to have:
  • Experience with cloud service providers
  • Experience working with AI researchers/EDA developers
  • Software development, release, and DevOps knowledge
Perks:
  • Highly competitive salaries
  • Comprehensive benefits package

Job Details

Hardware Infrastructure is seeking a Senior Technical Program Manager to lead the strategy and execution of programs to support capacity forecasting, planning, allocation and management across our internal clusters. The GPU infrastructure we build and operate enables NVIDIA's most advanced AI and hardware researchers and engineers to create the future of computing. The scope of the capacity management work spans compute, storage and network to ensure we have infrastructure that is functional, performant and reliable. This is a fast-paced and evolving landscape that requires a senior TPM leader to guide engineering roadmaps toward high-quality outcomes and a strong foundation of operational excellence. They will partner both internally within Hardware Infrastructure and externally with senior management and partner teams to scale the capacity management lifecycle. They will develop and standardize planning, reporting and execution methodologies and metrics to meet these challenging objectives.

What You'll Be Doing:

  • Work across multiple internal customer teams to build robust demand models that accurately provide a comprehensive picture of capacity requirements across compute, storage and network

  • Assist and play a key role in shaping the technical strategy and execution for how our internal serving platform meets internal customer needs

  • Nurture a culture of continuous improvement, finding new opportunities across tooling, automation and processes to scale overall capacity management

  • Take the lead in defining strategies that will help increase the efficiency and utilization of resources across internal clusters to minimize capacity waste

  • Guide a diverse set of engineering efforts in an agile program methodology across planning, prioritization, design, dependency management, implementation and execution

  • Bring a data-first approach to programs (metrics, OKRs, KPIs) to measure program success and identify areas for improvement

  • Create effective communication channels that give audiences at varying levels insight into program status, risks and opportunities

  • Act as an effective technical and non-technical liaison between developers, customers and partners to drive organization alignment across a multi-functional matrixed set of leads

What We Need To See:

  • B.S. (or equivalent experience) in Computer Science or a related technical field

  • 10+ years of experience across software engineering and/or technical program management roles with demonstrated expertise and mastery of technical and management practices

  • Prior experience developing processes and tools to forecast, allocate and manage infrastructure resources across a diverse and large portfolio ($billions)

  • Prior experience leading programs that span across multiple teams and engineers (100+)

  • Experience managing large-scale HPC and/or AI infrastructure deployments that span hardware and software

  • Exceptional communication and presentation skills for diverse technical and non-technical audiences

  • Strong multitasking abilities with a focus on thoroughness and rapid context switching

  • Knowledge of agile methodologies and best-in-class project management tools

  • Proactive and enthusiastic in identifying and implementing positive changes in software engineering and release management within a fast-paced environment

Ways To Stand Out From The Crowd:

  • Prior experience bringing up new datacenter capacity across cloud service providers and on-premise locations

  • Prior background in working with AI researchers and/or EDA developers

  • Knowledge of software development, release and support methodologies, and DevOps

NVIDIA offers highly competitive salaries and a comprehensive benefits package. We have some of the most forward-thinking and hardworking people in the world on our team and our collaborative talent continues to drive NVIDIA's growth. We are seeking creative and independent engineers with real passion for technology!

The base salary range is 192,000 USD - 304,750 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

About The Company

Since its founding in 1993, NVIDIA (NASDAQ: NVDA) has been a pioneer in accelerated computing. The company’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics, ignited the era of modern AI and is fueling the creation of the metaverse. NVIDIA is now a full-stack computing company with data-center-scale offerings that are reshaping industry.

