Senior Solution Engineer, Mission Control

2 Months ago • 5 Years + • Artificial Intelligence • Research & Development • $136,000 PA - $264,500 PA

Job Summary

Job Description

NVIDIA seeks a Senior Solution Engineer for its Mission Control team, focusing on automating AI Factory operations. The role involves direct customer interaction, troubleshooting software issues, resolving customer problems, and collaborating with engineering teams. Responsibilities include providing technical support, creating support tools, owning customer issues from start to finish, and documenting interactions. Expertise in Linux and container technologies (Kubernetes) is crucial, and experience with distributed GPU-accelerated workloads is a plus. The position requires strong problem-solving, communication, and organizational skills, along with proficiency in Python and experience with various AI/ML tools and frameworks.
Must have:
  • 5+ years of AI/ML engineering experience
  • Linux expertise for AI/ML workloads
  • Kubernetes experience on compute clusters
  • Excellent communication and problem-solving skills
  • Python proficiency, custom tool development
Good to have:
  • Experience with Chatbots, RAG pipelines, vector databases
  • Distributed training/inference workloads
  • GPU accelerated/cloud/virtualized environment experience
  • Docker/Kubernetes/Slurm experience
  • Experience with PyTorch or TensorFlow
  • C/C++ development experience
Perks:
  • Equity
  • Benefits

Job Details

NVIDIA is looking for an engineer who wants the buzz of direct customer interaction and the reward of contributing to software and products. We want the right person to join our team of Solution Engineers working on NVIDIA Mission Control, which automates the operations of AI Factories. We need an expert engineer to triage customer software issues and resolve customer problems. You must have excellent problem-solving abilities and communication experience and be able to work on multiple projects and tasks. You must be strong in Linux, have solid programming skills, and possess experience working with containers and related technologies such as Kubernetes. Experience analyzing the performance of distributed GPU-accelerated workloads is a plus.

What you'll be doing:

  • Provide direct support to our NVIDIA Enterprise customers: answer questions and reproduce or resolve customer issues.

  • Work with engineering teams on customer issues, providing logs, reproduction information, and other triage information.

  • Create/update product and/or support tools.

  • Own and drive customer issues from inception to resolution.

  • Document customer interactions and enhance our knowledge base.

  • Work with the latest hardware (e.g., GPUs, AI accelerators, high-speed interconnects) and software technologies such as parallel filesystems (e.g., Lustre, GPFS, WekaIO), Jupyter, Spark, Kubernetes, Ceph, and various ML frameworks and tools.

  • Occasionally work on weekends and holidays to support customers.

What we need to see:

  • Minimum of a BS in Computer Science, Electrical Engineering, or equivalent experience.

  • 5+ years of engineering experience with a proven track record in AI/ML-focused projects or enterprise-grade solutions.

  • Expertise analyzing, optimizing, and customizing Linux environments for AI/ML workloads.

  • Strong container orchestration/job scheduling experience on compute clusters, especially with Kubernetes.

  • Professional-level communication experience, with the ability to adjust to the technical level of the audience and to stay calm and focused in negative situations.

  • Excellent follow-up and organizational skills, with a love for solving problems.

  • Proficient in Python programming with the ability to develop scripts and build custom tools. Experience with parallel programming or GPU acceleration (e.g., CUDA) is highly desirable.
     

Ways to stand out from the crowd:

  • Experience with chatbots, RAG pipelines, vector databases, and distributed training or inference workloads.

  • Experience developing in GPU-accelerated, cloud, or virtualized environments.

  • Experience with containerized solutions and job scheduling (Docker, Kubernetes, and/or Slurm), and/or with analyzing the software performance of distributed workloads.

  • Experience with common deep learning frameworks such as PyTorch or TensorFlow.

  • Experience developing with C/C++.

The base salary range is 136,000 USD - 264,500 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

About The Company

Since its founding in 1993, NVIDIA (NASDAQ: NVDA) has been a pioneer in accelerated computing. The company’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics, ignited the era of modern AI and is fueling the creation of the metaverse. NVIDIA is now a full-stack computing company with data-center-scale offerings that are reshaping industry.