AI Network System Architect

1 Month ago • 2 Years + • Artificial Intelligence

Job Summary

Job Description

NVIDIA seeks a Senior AI Network System Architect to design and develop next-generation networking products for high-performance and ML/AI computing. Responsibilities include investigating emerging technologies in ML/AI, executing workloads, profiling and analyzing bottlenecks, optimizing communication libraries (NCCL, UCX), conceptualizing next-generation networking products, developing simulation models, and collaborating with multi-functional teams. The role requires expertise in ML/AI workloads, distributed training, large-scale network behavior, and simulation environments. Experience with communication libraries, network protocols (InfiniBand, IP, TCP, RoCE), and programming languages (Python, C++) is highly desirable.
Must have:
  • M.Sc./Ph.D. in CS/CE/EE
  • 2+ years experience in computer networks
  • Expertise in ML/AI workloads
  • Understanding of large-scale network behavior
  • Simulation environment development
  • Problem-solving and critical thinking
Good to have:
  • Knowledge of NCCL, UCX, UCC
  • Knowledge of InfiniBand, IP, TCP, RoCE
  • Experience with Python, C++, Docker
  • System engineering expertise
  • Experience with DLRM, LLM, or generative AI

Job Details

Our technology has no boundaries! NVIDIA is building the world’s most groundbreaking and state-of-the-art accelerated computing platforms. Because of our work, scientists, researchers, and engineers can advance their ideas. We pioneered a supercharged form of computing loved by the fastest-paced computer users in the world - scientists, designers, artists, and gamers.

We seek a highly motivated Senior AI Network System Architect to join our team of experts and help shape the future of high-performance and ML/AI computing. Our next-generation InfiniBand, NVLink, and Ethernet systems will be at the forefront of connecting and powering the world's most advanced AI clusters. As an AI system architect at NVIDIA, you will have the opportunity to work on some of the most cutting-edge technology and help drive the innovation of our next-generation networks that top researchers and engineers worldwide will use.

What You’ll Be Doing:

  • Investigating emerging technologies and methodologies in ML and AI to discern their interactions with network infrastructure.

  • Executing workloads on AI systems, conducting profiling, and analyzing bottlenecks and possible enhancements.

  • Conducting research and implementing optimizations for communication libraries like NCCL and UCX.

  • Spearheading the conceptualization of next-generation networking products tailored to support and accelerate state-of-the-art ML workloads.

  • Developing simulation models, analyzing simulation results, and developing optimization algorithms.

  • Collaborating with multi-functional teams, including other architecture teams, logic design, system software, firmware, and ML research teams, to ensure the successful execution of the project.

What We Need To See:

  • M.Sc. or Ph.D. degree in Computer Science, Computer Engineering, or Electrical Engineering.

  • At least 2 years of industry or research experience in computer networks.

  • Extensive expertise in ML/AI workloads, particularly in distributed training.

  • Excellent understanding of large-scale network behavior and the effect of distributed computing workloads on the network.

  • Experience in the development of simulation environments.

  • Great problem-solving and critical-thinking skills.

  • Ability to thrive in a fast-paced and dynamic environment.

  • Ability to work concurrently with multiple groups in the organization.

Ways To Stand Out From The Crowd:

  • Knowledge of communication libraries such as NCCL, UCX, and UCC.

  • Good knowledge of network protocols such as InfiniBand, IP, TCP, and RoCE, and of network topologies.

  • Experience with Python, C++, and Docker.

  • Expertise in system engineering, operations research, and intricate hardware-software integrated systems.

  • Demonstrated experience with DLRM, LLM, or other generative AI workloads.

NVIDIA has some of the most forward-thinking and hardworking people in the world working for us, and due to unprecedented growth, our world-class engineering teams are growing fast. If you're a creative and autonomous engineer with a real passion for technology, we want to hear from you.

We are committed to fostering a diverse work environment and are proud to be an equal-opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law. We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, perform essential job functions, and receive other benefits and privileges of employment. Please contact us to request accommodation.

About The Company

Since its founding in 1993, NVIDIA (NASDAQ: NVDA) has been a pioneer in accelerated computing. The company’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics, ignited the era of modern AI and is fueling the creation of the metaverse. NVIDIA is now a full-stack computing company with data-center-scale offerings that are reshaping industry.