Senior Software Engineer, Deep Learning Inference

1 Month ago • 5 Years + • Research & Development

Job Summary

Job Description

NVIDIA seeks a Senior Software Engineer passionate about performance optimization and generative AI. Responsibilities include collaborating with research teams to onboard LLMs and VLMs into NVIDIA's open-source AI runtimes, optimizing inference workloads, building robust inference software systems, implementing low-level GPU code, and owning end-to-end inference acceleration features. This role involves working with diverse teams to deliver production-grade products, requiring strong software design principles, proficiency in system and scripting languages, and a deep understanding of machine learning concepts.
Must have:
  • 5+ years software engineering experience
  • Profound knowledge of software design principles
  • Proficiency in system & scripting languages
  • Strong grasp of machine learning concepts
  • Excellent communication & teamwork skills
Good to have:
  • Familiarity with NVIDIA's DL software stack (Triton, TensorRT-LLM, Model Optimizer)
  • Experience with performance modeling, profiling, debugging on NVIDIA accelerators
  • Knowledge of LLM quantization, fine-tuning, and caching algorithms
  • GPU kernel programming (CUDA or OpenCL)
  • Experience on large software projects (50+ contributors)

Job Details

NVIDIA has been at the forefront of the deep learning revolution, pioneering innovations that have transformed the entire field. As the leading provider of GPUs and AI computing platforms, NVIDIA has empowered researchers and engineers worldwide to accelerate breakthroughs in artificial intelligence.

We seek a versatile Senior Software Engineer who is passionate about performance optimization and generative AI. Our team builds software solutions that enable efficient inference on the latest and greatest generative AI models. We tackle problems on all levels of the stack, from server-level request batching to GPU kernel fusion, and collaborate with teams across diverse disciplines to push NVIDIA's hardware to its full potential.

What you’ll be doing:

  • Cooperate with research teams to onboard new LLMs and VLMs into NVIDIA's open-source AI runtimes

  • Optimize inference workloads using sophisticated profiling and simulation tools

  • Build SOLID, extensible inference software systems and refine robust APIs

  • Implement and debug low-level GPU code to harness the latest hardware features

  • Own end-to-end inference acceleration features and work with teams around the world to deliver production-grade products

What we need to see:

  • B.Sc., M.Sc. or equivalent experience in Computer Science or Computer Engineering

  • 5+ years of relevant hands-on software engineering experience

  • Profound knowledge of software design principles

  • Strong proficiency in at least one system and one scripting language

  • Strong grasp of machine learning concepts

  • People person with excellent communication skills who enjoys collaboration and teamwork

Ways to stand out from the crowd:

  • Familiarity with NVIDIA's DL software stack, e.g. Triton Inference Server, TensorRT-LLM, and Model Optimizer

  • Proven track record of performance modeling, profiling, debugging, and development in a performance-critical setting with NVIDIA's accelerators

  • Familiarity with LLM quantization, fine-tuning, and caching algorithms

  • Proficiency in GPU kernel programming (CUDA or OpenCL)

  • Prior experience working on a large software project with 50+ contributors

NVIDIA is widely considered one of the world’s most desirable employers in the technology field. We have some of the most forward-thinking and hardworking people working for us. If you're creative and autonomous, we want to hear from you! We are committed to fostering a diverse work environment and are proud to be an equal-opportunity employer. We highly value diversity in our current and future employees. We do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status, or any other characteristic protected by law.

About The Company

Since its founding in 1993, NVIDIA (NASDAQ: NVDA) has been a pioneer in accelerated computing. The company’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics, ignited the era of modern AI and is fueling the creation of the metaverse. NVIDIA is now a full-stack computing company with data-center-scale offerings that are reshaping industry.
