Senior Software Engineer, HPC Platform Modernization

6 Months ago • 7 Years + • Devops • $185,000 PA - $252,000 PA

Job Summary

Job Description

Zoox is seeking an experienced Software Engineer to modernize their High-Performance Computing (HPC) infrastructure and its supporting ecosystem. This role involves developing key frameworks and services for Autonomous Vehicle development, utilizing technologies like Ray.io and SLURM. The engineer will be responsible for distributed system design, algorithmic job scheduling, and cloud scaling. The position offers a high degree of independence and the opportunity to shape the company's compute scaling strategy, working with autonomy and software teams to enhance developer experiences.
Must have:
  • 7+ years of experience
  • Experience with Ray.io
  • Experience with Kubernetes
  • Experience with Ray.io/Kubernetes on AWS/Azure/GCP
  • Proficiency in Python
Good to have:
  • Exposure to ML workloads
  • Experience with Kubernetes/SLURM at scale (>10k nodes)
  • Experience with SLURM
Perks:
  • Paid time off
  • Zoox Stock Appreciation Rights
  • Amazon RSUs
  • Health insurance
  • Long-term care insurance
  • Disability insurance
  • Life insurance

Job Details

Zoox is looking for an experienced Software Engineer to work on key new frameworks and infrastructure modernization for our custom High-Performance Computing infrastructure and its supporting ecosystem of tools and services. Zoox HPC services combine industry-best scheduling and workload orchestration technologies, such as Ray.io and SLURM, with value-add workflows specifically for Autonomous Vehicle development. These HPC services form the backbone of development workflows across all Zoox software teams, from data engineering, to training our AI models in Perception, Planner, and Prediction, to simulation and more. You will take on a breadth of end-to-end responsibilities, including distributed system design, algorithmic job scheduling, and adaptive cloud scaling, in support of all of Zoox’s computational needs.

The position comes with a high degree of independence and the opportunity to help define Zoox’s compute scaling strategy, both technically and organizationally. You will work closely with stakeholders in Autonomy and Software teams to iterate on world-class developer experiences, incorporating the latest industry tools and best practices.

In this role, you will:

  • Evaluate new distributed system paradigms and technologies to meet Zoox’s ever-growing computational and storage needs.
  • Strike a balance between incremental improvements to Zoox’s existing in-house HPC infrastructure and greenfield services and abstractions.
  • Create production-grade web service APIs, SDKs, and other tools to provide a world-class developer experience for all of Zoox’s software teams.

Qualifications

  • 7+ years of experience
  • Experience with Ray.io, particularly Ray Core and Ray Data
  • Experience with Kubernetes, particularly for heterogeneous workloads and clusters
  • Experience with Ray.io and Kubernetes deployed on Amazon Web Services (AWS) or other similar cloud providers such as Azure or GCP
  • Proficiency with Python

Bonus Qualifications

  • Exposure to machine learning workloads (training, inference, data generation, etc.) from a compute infrastructure service provider perspective
  • Experience with Kubernetes or SLURM at scale (>10k nodes)
  • Experience with SLURM workload manager

$185,000 - $252,000 a year
Base Salary Range

There are three major components to compensation for this position: salary, Amazon Restricted Stock Units (RSUs), and Zoox Stock Appreciation Rights. A sign-on bonus may be offered as part of the compensation package. The listed range applies only to the base salary. Compensation will vary based on geographic location and level. Leveling, as well as positioning within a level, is determined by a range of factors, including, but not limited to, a candidate's relevant years of experience, domain knowledge, and interview performance. The salary range listed in this posting is representative of the range of levels Zoox is considering for this position.

Zoox also offers a comprehensive package of benefits, including paid time off (e.g. sick leave, vacation, bereavement), unpaid time off, Zoox Stock Appreciation Rights, Amazon RSUs, health insurance, long-term care insurance, long-term and short-term disability insurance, and life insurance.

About Zoox
Zoox is developing the first ground-up, fully autonomous vehicle fleet and the supporting ecosystem required to bring this technology to market. Sitting at the intersection of robotics, machine learning, and design, Zoox aims to provide the next generation of mobility-as-a-service in urban environments. We’re looking for top talent that shares our passion and wants to be part of a fast-moving and highly execution-oriented team.

Accommodations
If you need an accommodation to participate in the application or interview process, please reach out to accommodations@zoox.com or your assigned recruiter.

A Final Note:
You do not need to match every listed expectation to apply for this position. Here at Zoox, we know that diverse perspectives foster the innovation we need to be successful, and we are committed to building a team that encompasses a variety of backgrounds, experiences, and skills.

About The Company

Zoox is transforming mobility-as-a-service by developing a fully autonomous, purpose-built fleet designed for AI to drive and humans to enjoy.