AI Network System Architect

2 Months ago • 2 Years +

Job Summary

Job Description

NVIDIA seeks a Senior AI Network System Architect to design and develop next-generation networking products for high-performance and ML/AI computing. Responsibilities include investigating emerging technologies in ML/AI, executing workloads, profiling and analyzing bottlenecks, optimizing communication libraries (NCCL, UCX), conceptualizing next-generation networking products, developing simulation models, and collaborating with multi-functional teams. The role requires expertise in ML/AI workloads, distributed training, large-scale network behavior, and simulation environments. Experience with communication libraries, network protocols (InfiniBand, IP, TCP, RoCE), and programming languages (Python, C++) is highly desirable.
Must have:
  • M.Sc./Ph.D. in CS/CE/EE
  • 2+ years experience in computer networks
  • Expertise in ML/AI workloads
  • Understanding of large-scale network behavior
  • Simulation environment development
  • Problem-solving and critical thinking
Good to have:
  • Knowledge of NCCL, UCX, UCC
  • Knowledge of InfiniBand, IP, TCP, RoCE
  • Experience with Python, C++, Docker
  • System engineering expertise
  • Experience with DLRM, LLM, or generative AI

Job Details

Our technology has no boundaries! NVIDIA is building the world’s most groundbreaking and state-of-the-art accelerated computing platforms. Because of our work, scientists, researchers, and engineers can advance their ideas. We pioneered a supercharged form of computing loved by the fastest-paced computer users in the world - scientists, designers, artists, and gamers.

We seek a highly motivated Senior AI Network System Architect to join our team of experts and help shape the future of high-performance and ML/AI computing. Our next-generation InfiniBand, NVLink, and Ethernet systems will be at the forefront of connecting and powering the world's most advanced AI clusters. As an AI system architect at NVIDIA, you will have the opportunity to work on some of the most cutting-edge technology and help drive the innovation of our next-generation networks that top researchers and engineers worldwide will use.

What You’ll Be Doing:

  • Investigating emerging technologies and methodologies in ML and AI to discern their interactions with network infrastructure.

  • Executing workloads on AI systems, conducting profiling, and analyzing bottlenecks and possible enhancements.

  • Conducting research and implementing optimizations for communication libraries like NCCL and UCX.

  • Spearheading the conceptualization of next-generation networking products tailored to support and accelerate state-of-the-art ML workloads.

  • Developing simulation models, analyzing simulation results, and developing optimization algorithms.

  • Collaborating with multi-functional teams, including other architecture teams, logic design, system software, firmware, and ML research teams, to ensure the successful execution of the project.

What We Need To See:

  • M.Sc. or Ph.D. degree in Computer Science, Computer Engineering, or Electrical Engineering.

  • At least 2 years of industry or research experience in computer networks.

  • Extensive expertise in ML/AI workloads, particularly in distributed training.

  • Excellent understanding of large-scale network behavior and the effect of distributed computing workloads on the network.

  • Experience in the development of simulation environments.

  • Great problem-solving and critical-thinking skills.

  • Ability to thrive in a fast-paced and dynamic environment.

  • Ability to work concurrently with multiple groups in the organization.

Ways To Stand Out From The Crowd:

  • Knowledge of communication libraries such as NCCL, UCX, and UCC.

  • Good knowledge of network protocols such as InfiniBand, IP, TCP, and RoCE, and of network topologies.

  • Experience with Python, C++, and Docker.

  • Expertise in system engineering, operations research, and intricate hardware-software integrated systems.

  • Demonstrated experience with DLRM, LLM, or generative AI workloads.

NVIDIA has some of the most forward-thinking and hardworking people in the world working for us, and due to unprecedented growth, our world-class engineering teams are growing fast. If you're a creative and autonomous engineer with a real passion for technology, we want to hear from you.

We are committed to fostering a diverse work environment and are proud to be an equal-opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law. We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, perform essential job functions, and receive other benefits and privileges of employment. Please contact us to request accommodation.

About The Company

Since its founding in 1993, NVIDIA (NASDAQ: NVDA) has been a pioneer in accelerated computing. The company’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics, ignited the era of modern AI and is fueling the creation of the metaverse. NVIDIA is now a full-stack computing company with data-center-scale offerings that are reshaping industry.
