Network Engineer (HPC/RDMA)

Posted 3 months ago • 5+ years of experience
Network Engineering

Job Description

TensorWave is seeking a passionate HPC/RDMA Engineer to join its IT team. The role involves designing and implementing networking solutions that support high-performance AI workloads and cloud services. Responsibilities include exploring and integrating new network fabrics to improve platform performance and scalability; ensuring network reliability, performance, and security for AI projects running on AMD and NVIDIA GPU technologies; and troubleshooting complex networking issues. The ideal candidate will have at least 5 years of experience in network engineering, strong knowledge of BGP, Ethernet protocols, RoCEv2, and network security practices, and experience with or interest in new network technologies for AI and cloud computing. This position offers opportunities for growth and creative problem-solving.
Good To Have:
  • Bachelor's degree in CS/IT
  • Interest in new network fabrics
Must Have:
  • 5+ years in network engineering
  • Focus on HPC/AI networking
  • Knowledge of BGP, Ethernet, RoCEv2
  • Network security practices
  • Familiarity with AMD/NVIDIA GPUs
  • Problem-solving skills
Perks:
  • Stock Options
  • 100% paid Medical, Dental, and Vision insurance
  • Life and Voluntary Supplemental Insurance
  • Short Term Disability Insurance
  • Flexible Spending Account
  • 401(k)
  • Flexible PTO
  • Paid Holidays
  • Parental Leave
  • Mental Health Benefits


At TensorWave, we’re leading the charge in AI compute, building a versatile cloud platform that’s driving the next generation of AI innovation. We’re focused on creating a foundation that empowers cutting-edge advancements in intelligent computing, pushing the boundaries of what’s possible in the AI landscape.

About the Role:

We are looking for an HPC/RDMA Engineer with a passion for AI and advanced networking technologies. The ideal candidate will support our vision by developing and managing the networking infrastructure that underpins our AI cloud services. This role involves exploring and integrating new types of network fabrics to enhance our platform's performance and scalability, ensuring optimal operation for our clients' AI projects.

Responsibilities:

  • Collaborate with a dynamic IT team to design and implement innovative networking solutions that meet the demands of high-performance AI workloads.

  • Lead initiatives to explore and integrate new types of network fabrics, enhancing the scalability and efficiency of our AI infrastructure.

  • Ensure network reliability, performance, and security for cloud services, optimizing for both AMD and NVIDIA GPU technologies.

  • Work closely with the AI development team to align networking strategies with the overall goals of TensorWave's cloud platform.

  • Troubleshoot and resolve complex networking issues, providing expert guidance and solutions to maintain high service levels.

Essential Skills & Qualifications:

  • Bachelor’s degree in Computer Science, Information Technology, or related field.

  • At least 5 years of relevant experience in network engineering, with a focus on supporting high-performance computing (HPC) and AI applications.

  • Strong knowledge of BGP, Ethernet protocols, RoCEv2, and network security practices.

  • Experience with or keen interest in exploring new network fabrics and technologies, particularly in the context of AI and cloud computing.

  • Familiarity with AMD and NVIDIA GPU ecosystems and their impact on network performance and configuration.

  • Exceptional problem-solving abilities and a commitment to innovation in networking for AI applications.

We’re looking for resilient, adaptable people to join our team—folks who enjoy collaborating and tackling tough challenges. We’re all about offering real opportunities for growth, letting you dive into complex problems and make a meaningful impact through creative solutions. If you're a driven contributor, we encourage you to explore opportunities to make an impact at TensorWave. Join us as we redefine the possibilities of intelligent computing.

What We Bring:

In addition to a competitive salary, we offer a variety of benefits to support your needs, including:

  • Stock Options

  • 100% paid Medical, Dental, and Vision insurance 

  • Life and Voluntary Supplemental Insurance

  • Short Term Disability Insurance

  • Flexible Spending Account

  • 401(k)

  • Flexible PTO

  • Paid Holidays

  • Parental Leave

  • Mental Health Benefits through Spring Health
