Network Operations Engineer

TensorWave

Job Description

Our mission at TensorWave Cloud is to build seamless, secure, reliable, and resilient AI infrastructure at scale, eliminating barriers and challenging the status quo to empower builders and support AI innovation.

About the Role

We are seeking Network Operations Engineers responsible for the day-to-day operation, maintenance, and on-call support of large-scale data center networks supporting AI and GPU workloads.

Core Responsibilities

  • Operate and maintain large Ethernet fabrics
  • Participate in on-call rotation and incident response
  • Execute maintenance, upgrades, and hardware replacements
  • Troubleshoot latency, packet loss, and connectivity issues
  • Support continuous scale-out growth of production networks

Required Experience

  • 5+ years of data center network operations experience
  • Hands-on experience with large Ethernet fabrics and edge networks
  • Strong understanding of scale-up vs scale-out architectures
  • Experience operating production networks at scale
  • Multi-vendor experience (Juniper, Cisco, Arista, whitebox)
  • NOS experience (Junos, IOS/IOS-XE, NX-OS, EOS, SONiC)
  • Automation or scripting experience in Python, Go, Bash, or equivalent

Preferred Experience

  • 100G+ environments
  • AI, GPU, or HPC exposure

We’re looking for resilient, adaptable people to join our team: people who believe in the mission and think at massive scale. Solutions that worked on a handful of devices will not work at exascale. Be prepared to be pushed daily, to learn a lot, and to build the future.

What We Bring

  • Mission-driven company
  • Competitive Salary
  • Stock Options
  • 100% paid Medical, Dental, and Vision insurance
  • Life and Voluntary Supplemental Insurance
  • Short-Term Disability Insurance
  • Flexible Spending Account
  • 401(k)
  • Flexible PTO
  • Paid Holidays
  • Parental Leave
  • Mental Health Benefits through Spring Health
