Manager, HPC Support Engineering

All levels • $160,000 PA - $240,000 PA
Software Development & Engineering

Job Description

About the role

We are looking for a hands-on and customer-focused HPC Support Engineering Manager to lead our Tier III Support Engineering team supporting customers on Private Cloud GPU clusters.

You’ll be responsible for guiding a team of HPC Support Engineers, ensuring escalations are handled with speed and consistency, and driving a high standard of technical excellence and customer experience. This role requires both strong technical depth in HPC and the ability to lead, mentor, and collaborate across Support, Product, Engineering, and Sales. You’ll also play a critical role in shaping the supportability of products by representing customer experience in internal discussions.

This position reports to the Manager of Support Operations and includes participation in an on-call rotation.

What You'll Do

  • Lead, coach, and mentor a team of HPC Support Engineers, fostering both technical growth and customer-first execution.
  • Ensure the highest quality of support for customers, who depend on our products for mission-critical workloads.
  • Own customer escalations and incidents, engaging directly with enterprise customers during high-visibility situations.
  • Partner with Product and Engineering teams to influence design decisions and ensure future offerings are supportable and reliable.
  • Stay current on the latest HPC and NVIDIA technologies, applying that knowledge to improve customer outcomes.
  • Develop and refine support processes, documentation, and workflows to ensure consistency and best practices.
  • Monitor and report on team performance, driving improvements in responsiveness, resolution quality, and customer satisfaction.
  • Manage team schedules, including on-call responsibilities, to ensure 24/7 coverage for critical issues.
  • Lead by example, actively participating in troubleshooting and case resolution when needed.

You

  • Proven experience leading technical support or engineering teams, with a track record of building high-performing groups that deliver strong customer outcomes.
  • Skilled at managing escalations, providing clear direction under pressure, and serving as the point of leadership in critical customer situations.
  • Strong knowledge of HPC clusters, including GPU/InfiniBand systems, networking, and node-level troubleshooting.
  • Advanced Linux administration and diagnostic skills.
  • Skilled at motivating teams, setting direction, and developing engineers into strong technical contributors.
  • Strong analytical and problem-solving skills with a proactive mindset.
  • Action-oriented, accountable, and able to align team priorities with company and customer goals.

Nice to have

  • Advanced degree in Computer Science, Engineering, or related field.
  • Certifications in HPC, networking, or related technologies.
  • Experience with Slurm, Kubernetes, InfiniBand, and other high-performance interconnects (RoCE, NVLink/NVSwitch).
  • Background supporting Private Cloud environments or other dedicated enterprise clusters.
  • Experience supporting enterprise AI workloads across startups and Fortune 500 companies.

Salary Range Information

This is a salaried exempt role. The annual salary range for this position has been set based on market data and other factors. However, a salary higher or lower than this range may be appropriate for a candidate whose qualifications differ meaningfully from those listed in the job description.

About

  • Founded in 2012, ~400 employees (2025) and growing fast
  • We offer generous cash & equity compensation
  • Our investors include Andra Capital, SGW, Andrej Karpathy, ARK Invest, Fincadia Advisors, G Squared, In-Q-Tel (IQT), KHK & Partners, NVIDIA, Pegatron, Supermicro, Wistron, Wiwynn, US Innovative Technology, Gradient Ventures, Mercato Partners, SVB, 1517, Crescent Cove.
  • We are experiencing extremely high demand for our systems, with quarter-over-quarter, year-over-year profitability
  • Our research papers have been accepted into top machine learning and graphics conferences, including NeurIPS, ICCV, SIGGRAPH, and TOG
  • Health, dental, and vision coverage for you and your dependents
  • Wellness and Commuter stipends for select roles
  • 401k Plan with 2% company match (USA employees)
  • Flexible Paid Time Off Plan that we all actually use

A Final Note:

You do not need to match all of the listed expectations to apply for this position. We are committed to building a team with a variety of backgrounds, experiences, and skills.

Equal Opportunity Employer

Lambda is an Equal Opportunity employer. Applicants are considered without regard to race, color, religion, creed, national origin, age, sex, gender, marital status, sexual orientation and identity, genetic information, veteran status, citizenship, or any other factors prohibited by local, state, or federal law.
