Infrastructure Engineer - Compute

NSCALE

Job Summary

Nscale is seeking an Infrastructure Engineer to join their Operational engineering team, responsible for designing, implementing, operating, and continuously improving the infrastructure stack for internal and customer-facing services. This role focuses on OpenStack, storage systems, Proxmox, and critical supporting services like DNS and DHCP. The engineer will ensure high availability, scalability, automation, and security, acting as a 3rd/4th line escalation point and contributing to infrastructure roadmap planning.

Must Have

  • Design, implement, and operate scalable and resilient infrastructure platforms (OpenStack, Proxmox, Ceph)
  • Continuously improve automation for provisioning, monitoring, patching, and recovery
  • Act as a 3rd/4th line escalation point for complex infrastructure issues
  • Expert level Linux systems administration
  • Strong experience with Ansible for infrastructure automation
  • 5+ years scripting in Python or Bash
  • Strong experience with large OpenStack clusters
  • Strong experience with Proxmox
  • Extensive Linux troubleshooting
  • Understanding of datacenter operational best practices

Good to Have

  • Knowledge of Ironic
  • Knowledge of Neutron/OVN/OVS

Perks & Benefits

  • Collaborative, supportive, and innovative environment
  • Highly competitive package (base + equity) with reviews every 12 months
  • Opportunity to join a fast-growing tech startup
  • Dynamic progression plan tailored to ambitions
  • Human-First Flexibility and flexible workplace
  • Remote-first team

Job Description

About Nscale

Nscale is the GPU cloud engineered for AI. We provide cost-effective, high-performance infrastructure for AI start-ups and large enterprise customers. Nscale enables AI-focused companies to achieve superior results by reducing the complexity of AI development. Our GPU cloud bolsters technical capabilities and directly supports strategic business outcomes, including cost management, rapid innovation, and environmental responsibility.

At Nscale, our Engineering team plays a critical role in driving the deployment and subsequent management of our infrastructure and software platforms.

We thrive on a culture of relentless innovation, ownership, and accountability, where every team member takes pride in their work and drives it with excellence and urgency. As an Nscaler, you’ll build trust through openness and transparency, where everyone is inspired to do their best work. If you join our team, you’ll be contributing to building the technology that powers the future.

About the Role (Job Purpose)

Infrastructure Engineers at Nscale sit inside the Operational engineering team. The Operational engineering team is responsible for the design, implementation, operation, and continuous improvement of the infrastructure stack that underpins all internal and customer-facing services. This includes all components below the hypervisor, with a strong focus on OpenStack, storage systems, Proxmox, and critical supporting services such as DNS, DHCP, and infrastructure automation.

This team ensures high levels of availability, scalability, automation, and security for the infrastructure layers they own.

This team acts as a 3rd/4th line escalation point for support organisations, as well as providing subject matter expertise to pre-sales and other groups within the organisation.

What You'll be Doing (Responsibilities)

  • Designing, implementing, and operating scalable and resilient infrastructure platforms, with a strong focus on OpenStack, Proxmox, Ceph, and supporting critical services such as DNS, DHCP, and configuration management.
  • Continuously improving automation for provisioning, monitoring, patching, and recovery using infrastructure-as-code and configuration management tools.
  • Collaborating with internal teams to ensure infrastructure solutions meet performance, availability, and security requirements.
  • Acting as a 3rd/4th line escalation point for complex infrastructure issues, and working closely with support teams to resolve problems and identify root causes.
  • Contributing to infrastructure roadmap planning, including capacity management, performance tuning, and introducing new technologies.
  • Supporting pre-sales and solution design efforts by providing technical expertise on infrastructure capabilities and best practices.
  • Ensuring all infrastructure platforms adhere to compliance, security, and operational standards.
  • Participating in on-call rotations and incident response activities for critical infrastructure services.

About You (Skills / Qualifications)

  • Expert level experience with Linux systems administration
  • Strong experience of designing and building automation of both physical and virtual infrastructure using tools like Ansible.
  • 5+ years of experience scripting in Python or Bash
  • Strong experience of deploying, managing, upgrading and operating large OpenStack clusters.
  • Strong experience of deploying, managing, and automating Proxmox
  • Extensive troubleshooting experience of Linux and services running on Linux
  • Understanding of datacenter operational best practices for power, cooling, and high-density compute.
  • Ability to collaborate across global engineering and operations teams.

Nice to have:

  • Knowledge of Ironic
  • Knowledge of Neutron/OVN/OVS

What We Can Offer You

At Nscale, you'll find a collaborative, supportive, and innovative environment where your contributions spark real impact. We're building something extraordinary, and we want you at the core.

  • Highly competitive package (base + equity) with reviews every 12 months. 🚀
  • Join the fastest-growing tech startup: this is your chance to push boundaries, collaborate with brilliant minds, and make your mark on cutting-edge AI. ✨
  • Expect a dynamic progression plan tailored to your ambitions. Grow by trying new things, leading, challenging the status quo, and owning your impact, always with our full support.
  • Human-First Flexibility: We treat you as humans first. 🫶🏽 Our flexible workplace trusts Nscalers to deliver, giving you the autonomy to shape your day around life's moments.

Join our thriving remote-first team. Geography is no barrier to impact or connection. We build seamless virtual collaboration, empowering you wherever you work.

Equal Opportunities Statement

We strongly encourage applications from people of colour, the LGBTQ+ community, people with disabilities, neurodivergent people, parents, carers, and people from lower socio-economic backgrounds.

If there’s anything we can do to accommodate your specific situation, please let us know.

The responsibilities outlined in this job description are not exhaustive and are intended to provide a general overview of the position. The employee may be required to perform additional duties, tasks, and responsibilities as assigned by management, consistent with the skills and qualifications required for the role.
