Senior System Networking Engineer, InfiniBand

3 Months ago • 8 Years + • Artificial Intelligence • Research & Development

Job Summary

Job Description

NVIDIA seeks a Senior System Networking Engineer specializing in InfiniBand for its HPC/AI E2E Verification team. This role involves designing and implementing innovative architectures for high-performance computing systems, focusing on scalability, performance, and functionality of NVIDIA InfiniBand HPC/AI solutions. Responsibilities include collaborating with cross-functional teams, planning and executing end-to-end test scenarios, analyzing results, generating reports, and driving process improvements. The ideal candidate possesses in-depth InfiniBand, Linux networking, and HPC/AI experience, with a strong analytical and problem-solving aptitude. Experience with AI application benchmarks and distributed job scheduling is highly desirable.
Must have:
  • 8+ years of experience in networking
  • InfiniBand XDR/NDR knowledge
  • Linux networking expertise
  • Strong analytical & problem-solving skills
  • HPC/AI architecture understanding
Good to have:
  • AI application benchmark experience
  • Distributed job scheduling expertise
  • NVIDIA Networking & AI architecture knowledge

Job Details

NVIDIA is looking for an outstanding candidate for a Senior System Networking Engineer role on the HPC/AI E2E Verification team. Be a key player working with the most exciting computing hardware and software, and contribute to the latest breakthroughs in InfiniBand networking technologies and high-performance computing. You will work with the latest InfiniBand-based switches, HCAs, AI servers, and software, together with many researchers, architects, and developers leading differentiated InfiniBand HPC/AI solutions.

What You Will Be Doing:

  • As a Senior System InfiniBand Networking Engineer, you will play a crucial role in crafting and implementing innovative architectures for high-performance computing systems, enabling efficient and scalable computation for AI/ML applications and HPC benchmarks

  • Collaborating closely with multi-functional teams, including hardware engineers, software developers, and domain experts, to deliver optimized solutions that meet the demanding requirements of HPC/AI workloads

  • Planning, reviewing, and executing complex end-to-end scenarios with a strong emphasis on scalability, performance, and functionality of NVIDIA InfiniBand HPC/AI solutions, ensuring alignment with NVIDIA Networking & AI specifications

  • Analyzing test results and generating detailed reports for stakeholders to facilitate informed decision-making

  • Driving continuous improvement initiatives, identifying opportunities to enhance verification processes and methodologies in the context of NVIDIA Networking & AI solutions

What We Need To See:

  • Bachelor's/Master's degree in Electrical Engineering, Computer Science, or equivalent experience in the networking/systems field

  • 8+ years of experience driving large-scale, complex solutions with a strong emphasis on network troubleshooting and performance analysis

  • In-depth experience with and understanding of Linux-based networking systems

  • Strong analytical and problem-solving skills

  • Excellent communication and interpersonal skills

  • Ability to work effectively in a collaborative, fast-paced environment

Ways To Stand Out From The Crowd:

  • In-depth knowledge of InfiniBand XDR/NDR technology, NVIDIA Networking, and AI architectures, protocols, and standards

  • Expertise in high-performance computing and machine learning

  • Experience with AI application benchmarks and distributed job scheduling

We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, sex, gender, gender expression, sexual orientation, age, marital status, veteran status, or disability status. We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us to request accommodation.

About The Company

Since its founding in 1993, NVIDIA (NASDAQ: NVDA) has been a pioneer in accelerated computing. The company’s invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined computer graphics, ignited the era of modern AI and is fueling the creation of the metaverse. NVIDIA is now a full-stack computing company with data-center-scale offerings that are reshaping industry.
