
Senior Solution Engineer, Mission Control

20 Hours ago • 5 Years + • Artificial Intelligence • Research & Development • $136,000 PA - $264,500 PA

Job Summary

Job Description

NVIDIA seeks a Senior Solution Engineer for its Mission Control team, which focuses on automating AI Factory operations. The role involves direct customer interaction, troubleshooting software issues, resolving customer problems, and collaborating with engineering teams. Responsibilities include providing technical support, creating support tools, owning customer issues from start to finish, and documenting interactions. Expertise in Linux and container technologies such as Kubernetes is crucial, and experience with distributed GPU-accelerated workloads is a plus. The position requires strong problem-solving, communication, and organizational skills, along with proficiency in Python and experience with various AI/ML tools and frameworks.
Must have:
  • 5+ years of AI/ML engineering experience
  • Linux expertise for AI/ML workloads
  • Kubernetes experience on compute clusters
  • Excellent communication and problem-solving skills
  • Python proficiency, custom tool development
Good to have:
  • Experience with Chatbots, RAG pipelines, vector databases
  • Distributed training/inference workloads
  • GPU accelerated/cloud/virtualized environment experience
  • Docker/Kubernetes/Slurm experience
  • Experience with PyTorch or TensorFlow
  • C/C++ development experience
Perks:
  • Equity
  • Benefits

Job Details

NVIDIA is looking for an engineer who wants the buzz of direct customer interaction and the reward of contributing to software and products. We want the right person to join our team of Solution Engineers working on NVIDIA Mission Control, which automates the operations of AI Factories. We need an expert engineer to triage customer software issues and resolve customer problems. You must have excellent problem-solving abilities and communication experience and be able to work on multiple projects and tasks. You must be strong in Linux, have solid programming skills, and possess experience working with containers and related technologies such as Kubernetes. Experience analyzing the performance of distributed GPU-accelerated workloads is a plus.

What you'll be doing:

  • Provide direct support to our NVIDIA Enterprise customers: answer questions and reproduce or resolve customer issues.

  • Work with engineering teams on customer issues, providing logs, reproduction information, and other triage information.

  • Create/update product and/or support tools.

  • Own and drive customer issues from inception to resolution.

  • Document customer interactions and enhance our knowledge base.

  • Work with the latest hardware (e.g. GPUs, AI accelerators, high-speed interconnects) and software technologies such as parallel filesystems (e.g. Lustre, GPFS, WekaIO), Jupyter, Spark, Kubernetes, Ceph, and various ML frameworks and tools.

  • Occasionally work on weekends and holidays to support customers.

What we need to see:

  • Minimum of a BS in Computer Science, Electrical Engineering, or equivalent experience.

  • At least 5 years of engineering experience with a proven track record in AI/ML-focused projects or enterprise-grade solutions.

  • Expertise analyzing, optimizing, and customizing Linux environments for AI/ML workloads.

  • Strong container orchestration/job scheduling experience on compute clusters, especially with Kubernetes.

  • Professional-level communication experience, able to adjust to the technical level of the audience, and stay calm and focused in negative situations.

  • Excellent follow-up and organizational skills, with a love for solving problems.

  • Proficient in Python programming with the ability to develop scripts and build custom tools. Experience with parallel programming or GPU acceleration (e.g., CUDA) is highly desirable.
     

Ways to stand out from the crowd:

  • Experience with Chatbots, RAG pipelines, vector databases, distributed training or inference workloads

  • Experience developing in GPU accelerated / cloud / virtualized environments

  • Containerized solutions/job scheduling experience with knowledge of Docker and/or Kubernetes and/or Slurm, and/or experience analyzing software performance of distributed workloads

  • Experience with common deep learning frameworks such as PyTorch or TensorFlow

  • Experience developing with C/C++

The base salary range is 136,000 USD - 264,500 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

Similar Jobs

Henkel - Data Scientist-Intern

Henkel

Pune, Maharashtra, India (On-Site)
6 Months ago
Seedify - AI Product Manager

Seedify

India (Remote)
2 Months ago
Sony Interactive Entertainment - AI/Machine Learning Engineer

Sony Interactive Entertainment

Tokyo, Japan (On-Site)
3 Months ago
ByteDance - Video Analysis and Quality Algorithm Intern 2023 Summer/Fall (MS)

ByteDance

San Jose, California, United States (On-Site)
4 Months ago
NVIDIA - Senior Developer Relations Manager - Robotics

NVIDIA

Tokyo, Japan (On-Site)
2 Months ago
Meta - Software Engineer, Systems ML - SW/HW Co-design

Meta

Bellevue, Washington, United States (Remote)
4 Months ago
Google - Senior Software Engineer, Core Machine Learning, Google Cloud

Google

New York, New York, United States (On-Site)
4 Months ago
CharacterAI - Software Engineer, Machine Learning Infrastructure

CharacterAI

New York, New York, United States (On-Site)
1 Day ago
VGW - Senior Machine Learning Engineer

VGW

Perth, Western Australia, Australia (On-Site)
1 Day ago


Similar Skill Jobs

ByteDance - DevOps Engineer - Applied Machine Learning Engine (Singapore)

ByteDance

Singapore (On-Site)
4 Months ago
Rackspace Technology - Machine Learning Architect (AWS)

Rackspace Technology

(Remote)
2 Months ago
Ubisoft - Senior ML Data Scientist

Ubisoft

Montreal, Quebec, Canada (On-Site)
2 Months ago
Meta - Software Engineer, Machine Learning

Meta

Seattle, Washington, United States (On-Site)
4 Months ago
Rackspace Technology - Principal MLOps Engineer

Rackspace Technology

San Antonio, Texas, United States (Remote)
1 Day ago
ByteDance - Research Scientist in Foundation Model, Speech Understanding - 2025 Start (PhD)

ByteDance

San Jose, California, United States (On-Site)
4 Months ago
Google - Data Scientist, Extended Workforce Solutions

Google

(On-Site)
3 Months ago
ByteDance - Research Scientist- Applied Machine learning Graduates (AML) - 2024 Start (PhD)

ByteDance

San Jose, California, United States (On-Site)
4 Months ago
Electronic Arts - Data Science Engineer

Electronic Arts

Hyderabad, Telangana, India (On-Site)
3 Weeks ago
Meta - Research Scientist Intern, Smart Glasses in Wearables AI (PhD)

Meta

Redmond, Washington, United States (On-Site)
4 Months ago


Jobs in Durham, North Carolina, United States

Axon - Director, Content Strategy & Operations

Axon

Scottsdale, Arizona, United States (On-Site)
3 Months ago
Epic Games - Senior Engine Programmer, Framework

Epic Games

Cary, North Carolina, United States (On-Site)
2 Months ago
ByteDance - Research Engineer Graduate (Vision AI Platform)

ByteDance

Seattle, Washington, United States (On-Site)
1 Day ago
PlayStation Global - Manager, Sales and Revenue Forecasting

PlayStation Global

Aliso Viejo, California, United States (Remote)
1 Month ago
CD PROJEKT RED - Senior Weapons Artist

CD PROJEKT RED

Boston, Massachusetts, United States (On-Site)
1 Month ago
Twitch - Software Development Engineer

Twitch

San Francisco, California, United States (On-Site)
1 Month ago
31st Union - Senior Test Automation Engineer

31st Union

San Mateo, California, United States (On-Site)
6 Days ago
ByteDance - Senior Software Engineer - MySQL

ByteDance

Seattle, Washington, United States (On-Site)
1 Month ago
PlayQ - Don't See What You're Looking For?

PlayQ

Santa Monica, California, United States (On-Site)
6 Months ago
Sitetracker - Small Business Account Executive (SMB)

Sitetracker

Montclair, New Jersey, United States (Remote)
5 Months ago


Artificial Intelligence Jobs

Zoox - Software Engineer - Perception

Zoox

Foster City, California, United States (Hybrid)
5 Months ago
Airlab Inc - Artificial Intelligence Researcher

Airlab Inc

Montreal, Quebec, Canada (On-Site)
8 Months ago
Scale AI - QA Engineer, Generative AI

Scale AI

Argentina (On-Site)
5 Months ago
ByteDance - Research Scientist, Foundation Model, Speech Understanding

ByteDance

Seattle, Washington, United States (On-Site)
4 Months ago
Krafton - Deep Learning Engineer - LLM Game Agent

Krafton

Seoul, South Korea (On-Site)
1 Month ago
GoMotive - Computer Vision Engineer

GoMotive

Pakistan (Remote)
1 Week ago
Zoox - Senior/Staff Software Engineer - Simulation Infrastructure

Zoox

Foster City, California, United States (Hybrid)
5 Months ago
Meta - AI Research Scientist, Language - Generative AI

Meta

Burlingame, California, United States (On-Site)
4 Months ago
Krafton - [Global Strategy & BD Div.] Strategy Manager (AI Ethics) (4 to 7 years)

Krafton

Seoul, South Korea (On-Site)
3 Months ago
Zoox - Senior/Staff Machine Learning Engineer - Prediction & Behavior ML

Zoox

Boston, Massachusetts, United States (Hybrid)
5 Months ago


About The Company

Since its founding in 1993, NVIDIA (NASDAQ: NVDA) has been a pioneer in accelerated computing. The company’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics, ignited the era of modern AI and is fueling the creation of the metaverse. NVIDIA is now a full-stack computing company with data-center-scale offerings that are reshaping industry.


