Engineering Manager, ML Storage

11 Minutes ago • All levels • Research Development • $240,000 PA - $300,000 PA

Job Summary

Job Description

Zoox is seeking an experienced Software Engineering Manager to lead its High Performance Computing Storage infrastructure team. This role involves managing petabyte-scale data movement and management for critical high-throughput use cases like ML foundation model training and synthetic data generation. Responsibilities include distributed system design, optimizing storage-related GPU utilization, and cost-effective resource management. The manager will also be responsible for hiring, team health, and career development, with a high degree of independence to define Zoox’s scaling strategy.

Job Details

Zoox is looking for an experienced Software Engineering Manager to lead our High Performance Computing Storage infrastructure team. Zoox HPC Storage owns the abstraction layers for petabyte+ scale data movement and management for critical high-throughput use cases, such as ML foundation model training and synthetic data generation. You will take on a breadth of end-to-end responsibilities, including distributed system design, optimization of storage-related GPU utilization bottlenecks, and cost-effective resource management.

The position comes with a high degree of independence and the opportunity to help define Zoox’s scaling strategy, both technically and organizationally. You will be responsible for hiring and maintaining the health of your team, as well as growing and coaching teammates to support the continued success of their careers.

In this role, you will:

  • Design and implement improvements to Zoox’s in-house, cutting-edge HPC Storage infrastructure
  • Investigate new data movement and management paradigms to meet Zoox’s ever-growing computational and storage needs in a cost-effective manner
  • Create production-grade web service APIs, SDKs, and other tools to provide a world-class developer experience for all of Zoox’s software teams
  • Mentor, coach, and advocate for your direct reports
  • Expand the team by participating in all stages of the hiring process
  • Plan work streams and coordinate projects with customer teams
  • Align goals and progress with Zoox milestones

Qualifications:

  • Experience managing teams of 5-10+
  • Demonstrated ability to prioritize development work and build cross-functional consensus across ML stakeholders
  • Experience with high performance storage systems deployed on cloud providers, such as FSx for Lustre on Amazon Web Services (AWS)
  • Strong operational background with highly available systems
  • Bachelor's degree in computer science (or related field)

Bonus Qualifications:

  • Experience with ML-specific data formats such as Mosaic Streaming Datasets (MDS)
  • Experience with end-to-end hosted ML services such as AWS SageMaker HyperPod
  • Proficiency with Python, Java, or other managed languages

Compensation

There are three major components to compensation for this position: salary, Amazon Restricted Stock Units (RSUs), and Zoox Stock Appreciation Rights. The salary range for this position is $240,000 to $300,000. A sign-on bonus may be offered as part of the compensation package. Compensation will vary based on geographic location and level. Leveling, as well as positioning within a level, is determined by a range of factors, including, but not limited to, a candidate's relevant years of experience, domain knowledge, and interview performance. The salary range listed in this posting is representative of the range of levels Zoox is considering for this position. Zoox also offers a comprehensive package of benefits, including paid time off (e.g., sick leave, vacation, bereavement), unpaid time off, Zoox Stock Appreciation Rights, Amazon RSUs, health insurance, long-term care insurance, long-term and short-term disability insurance, and life insurance.

Vaccine Mandate

Employees working in this position will be required to have received a vaccine approved by the U.S. Food and Drug Administration and/or the World Health Organization. In addition, employees who are eligible for a COVID-19 booster vaccine (“Booster”) will be required to receive a Booster. Employees will be required to show proof of vaccination status upon receipt of a conditional offer of employment. That offer of employment will be conditioned upon, among other things, an Applicant’s ability to show proof of vaccination status. Please note the Company provides reasonable accommodations in accordance with applicable state, federal, and local laws.

About Zoox

Zoox is developing the first ground-up, fully autonomous vehicle fleet and the supporting ecosystem required to bring this technology to market. Sitting at the intersection of robotics, machine learning, and design, Zoox aims to provide the next generation of mobility-as-a-service in urban environments. We’re looking for top talent that shares our passion and wants to be part of a fast-moving and highly execution-oriented team.

Accommodations

If you need an accommodation to participate in the application or interview process, please reach out to accommodations@zoox.com or your assigned recruiter.

A Final Note:

You do not need to match every listed expectation to apply for this position. Here at Zoox, we know that diverse perspectives foster the innovation we need to be successful, and we are committed to building a team that encompasses a variety of backgrounds, experiences, and skills.

About The Company

Zoox is transforming mobility-as-a-service by developing a fully autonomous, purpose-built fleet designed for AI to drive and humans to enjoy.

Foster City, California, United States (Hybrid)