HPC Systems Engineer (Network)

Network Engineering

Job Description

Nscale is seeking an HPC Systems Engineer (Network) to design, deploy, and operate high-speed networking services for its GPU cloud infrastructure, engineered for AI. This role involves managing internet transit, WAN connectivity, and data center networking, acting as a 3rd/4th line escalation point. The engineer will work with deployment teams, troubleshoot performance issues, and automate hardware operations, contributing to cost-effective, high-performance AI infrastructure.
Must Have:
  • Design, deploy, and operate high-speed Ethernet based networks
  • Design, deploy, and operate high-speed InfiniBand networks
  • Work with deployment teams to ensure BOMs are correct and fit for purpose
  • Provide input into DC layout and Rack elevations for correct Reference Architecture implementation
  • Troubleshoot performance issues on both Ethernet and InfiniBand networks
  • Automate deployment and initial standup of multiple vendors' hardware
  • Automate day-to-day operations of multiple vendors' hardware
  • Design, implement, and support WAN infrastructure
  • Mandatory in-datacentre experience in deploying high-quality network infrastructure
  • Hands-on experience installing and configuring network equipment (switches, routers, firewalls)
  • Proven experience with VLAN, LACP, MLAG, BGP, OSPF, EVPN, VXLAN
  • Proven experience with InfiniBand configuration, performance tuning, and troubleshooting (Subnet Managers, QoS, RDMA)
  • Strong knowledge of MPLS, BGP, IPsec, GRE, SD-WAN, and associated routing protocols
  • Knowledge of HPC networking topologies (Fat Tree, Rail)
  • Strong Python or Bash scripting skills
  • Strong Optics and hardware knowledge for BOM design

Join Nscale as an HPC Systems Engineer (Network)

Are you passionate about Data Centre builds and large-scale GPU infrastructure projects? Do you thrive in a fast-paced, high-growth environment where your work has a direct impact on business outcomes? If so, this could be the role for you!

Nscale is the GPU cloud engineered for AI. We provide cost-effective, high-performance infrastructure for AI start-ups and large enterprise customers. Nscale enables AI-focused companies to achieve superior results by reducing the complexity of AI development. Our GPU cloud bolsters technical capabilities and directly supports strategic business outcomes, including cost management, rapid innovation, and environmental responsibility.

At Nscale, our Engineering team plays a critical role in driving the delivery of our GPU infrastructure. If you're passionate about data centre architecture and thrive in high-performance computing environments, please apply.

Why Nscale?

We thrive on a culture of relentless innovation, ownership, and accountability, where every team member takes pride in their work and drives it with excellence and urgency. As an Nscaler, you’ll build trust through openness and transparency, where everyone is inspired to do their best work. If you join our team, you’ll be contributing to building the technology that powers the future.

About the Role

Network engineers at Nscale are responsible for the design, deployment, and ongoing operation of all networking services that underpin both the internal management platform and the customer-facing cloud infrastructure. This includes internet transit, WAN connectivity, and DC networking. You will act as a 3rd/4th-line escalation point for the support organisation.

What You’ll be Doing

  • Designing, deploying and operating high-speed Ethernet-based networks
  • Designing, deploying and operating high-speed InfiniBand networks
  • Working with deployment teams to ensure BOMs are correct and fit for purpose
  • Providing input into DC layout and Rack elevations, to ensure our Reference Architectures are implemented correctly
  • Troubleshooting performance issues on both Ethernet and InfiniBand networks
  • Automating the deployment and initial standup of multiple vendors' hardware
  • Automating the day-to-day operations of multiple vendors' hardware
  • Designing, implementing and supporting our WAN infrastructure

About you

  • Experience in deploying high-quality network infrastructure, ensuring efficiency and reliability in data centres. In-datacentre experience is mandatory.
  • Hands-on experience with installing and configuring network equipment, including switches, routers and firewalls.
  • Proven experience in designing, deploying, and operating high-speed Ethernet networks, including technologies such as VLAN, LACP, MLAG, BGP, OSPF, EVPN, and VXLAN.
  • Proven experience in designing, deploying, and operating InfiniBand networking, including configuration, performance tuning, and troubleshooting (e.g., Subnet Managers, QoS, RDMA).
  • Strong knowledge of WAN technologies including MPLS, BGP, IPsec, GRE, SD-WAN, and associated routing protocols.
  • Knowledge of HPC networking topologies (Fat Tree, Rail)
  • Strong Python or Bash scripting skills
  • Strong Optics and hardware knowledge to assist with BOM design

Personal Attributes

  • Proactive and self-motivated, with a strong sense of ownership.
  • Thrives in a fast-paced, dynamic, and high-growth environment.
  • Collaborative team player with a passion for delivering outstanding candidate and stakeholder experiences.
  • Strong attention to detail and documentation skills.
  • Excellent communication skills, both written and verbal.
  • A self-starter mindset with a “see a problem, fix a problem” mentality.

Please Note: This role will require 20-30% travel to our European sites.

In all we do, our core values guide us.

Relentless Innovation

At Nscale, we constantly push the boundaries of innovation, embracing creative risks to shape the future. Our aim is to deliver products that not only meet but exceed today’s expectations, setting new standards for tomorrow.

Ownership and Accountability

Every Nscaler is fully accountable for their work, driving it with excellence and urgency. We set high standards, ensuring that our contributions are not just good but exceptional.

Openness and Transparency

We believe trust and transparency are key to our success. We maintain open communication within our teams and with stakeholders, sharing both successes and challenges. Our open-source approach allows customers to explore our technology, building trust and ensuring our solutions are innovative, secure, and reliable.

Customer-Centric Focus

Our customers are central to our mission, and we are committed to delivering impactful solutions that drive real-world success. We focus on deeply understanding their needs and challenges, striving to exceed expectations in both product quality and service.

Sustainability

We are dedicated to considering the long-term environmental and societal impacts of our technologies. By integrating sustainability into our operations and product development, we ensure that our innovations are both effective and responsible, contributing positively to the world around us.

Full-Speed Collaboration

Collaboration at Nscale is fast, efficient, and respectful. We work together seamlessly, with clear communication and mutual respect, ensuring our shared goals are met with high standards and impactful outcomes.

Equal Opportunities Statement

We strongly encourage applications from people of colour, the LGBTQ+ community, people with disabilities, neurodivergent people, parents, carers, and people from lower socio-economic backgrounds.

If there’s anything we can do to accommodate your specific situation, please let us know.

The responsibilities outlined in this job description are not exhaustive and are intended to provide a general overview of the position. The employee may be required to perform additional duties, tasks, and responsibilities as assigned by management, consistent with the skills and qualifications required for the role.
