NIM Solution Architect

23 Hours ago • 3 Years + • Artificial Intelligence

Job Summary

Job Description

As a NIM Solution Architect at NVIDIA, you will drive the implementation and deployment of NVIDIA Inference Microservice (NIM) solutions. Responsibilities include using NIM Factory Pipeline to package optimized models into containers, refining NIM tools for the community, designing agentic AI solutions using NIMs, delivering technical projects and demos, providing client support, collaborating with cross-functional teams, and championing NVIDIA software within the technical community. You'll also support the NVAIE team and contribute to their business in China. This role requires expertise in deploying and optimizing large language models, proficiency in inference frameworks (TensorRT, ONNX Runtime, PyTorch), strong Python/C++ programming, and familiarity with DevOps/MLOps practices.
Must have:
  • 3+ years experience
  • LLM deployment & optimization
  • Inference framework proficiency (TensorRT, etc.)
  • Python/C++ programming skills
  • DevOps/MLOps experience
  • Problem-solving & troubleshooting skills
Good to have:
  • Experience with field LLM projects
  • TensorRT expertise
  • AI workflow design experience
  • Cluster resource management tools
  • Agile methodologies
  • CUDA optimization experience
  • Large-scale HPC/enterprise system design

Job Details

NVIDIA is a leading company in AI computing. At NVIDIA, our employees are passionate about AI, HPC, visualization, and gaming. Our Solution Architect team focuses on bringing new NVIDIA technology into different industries. We help design the architecture of AI computing platforms and analyze AI and HPC applications to deliver value to customers. This role will be instrumental in leveraging NVIDIA's cutting-edge technologies to optimize open-source and proprietary large models, create AI workflows, and support our customers in implementing advanced AI solutions.

What you’ll be doing:

  • Drive the implementation and deployment of NVIDIA Inference Microservice (NIM) solutions 

  • Use the NVIDIA NIM Factory Pipeline to package optimized models (including LLM, VLM, Retriever, CV, OCR, etc.) into containers that provide standardized API access for on-prem or cloud deployment

  • Refine NIM tools for the community and help community members build their own performant NIMs

  • Design and implement agentic AI tailored to customer business scenarios using NIMs

  • Deliver technical projects, demos and client support tasks as directed by the Solution Architecture Leadership 

  • Provide technical support and guidance to customers, facilitating the adoption and implementation of NVIDIA technologies and products 

  • Collaborate with cross-functional teams to enhance and expand our AI solutions portfolio

  • Be an internal champion for NVIDIA software and total solutions in the technical community

  • Be an industry thought leader on integrating NVIDIA technology, especially inference services, into LHA, business partners, and the wider community

  • Assist in supporting the NVAIE team and driving NVAIE business in China

What we need to see:

  • 3+ years of work experience and a Bachelor's or Master's degree in Computer Science, Artificial Intelligence, or a related field

  • Proven experience in deploying and optimizing large language models 

  • Proficiency in at least one inference framework (e.g., TensorRT, ONNX Runtime, PyTorch) 

  • Strong programming skills in Python or C++ 

  • Familiarity with mainstream inference engines (e.g., vLLM, SGLang)

  • Experience with DevOps/MLOps such as Docker, Git, and CI/CD practices 

  • Excellent problem-solving skills and ability to troubleshoot complex technical issues 

  • Demonstrated ability to collaborate effectively across diverse, global teams, adapting communication styles while maintaining clear, constructive professional interactions 

Ways to stand out from the crowd:

  • Experience in architectural design for field LLM projects 

  • Expertise in model optimization techniques, particularly using TensorRT 

  • Knowledge of AI workflow design and implementation, experience with cluster resource management tools, and familiarity with agile development methodologies

  • CUDA optimization experience and extensive experience designing and deploying large-scale HPC and enterprise computing systems

About The Company

Since its founding in 1993, NVIDIA (NASDAQ: NVDA) has been a pioneer in accelerated computing. The company’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics, ignited the era of modern AI and is fueling the creation of the metaverse. NVIDIA is now a full-stack computing company with data-center-scale offerings that are reshaping industry.
