Engineering Manager, Offline Inference, Machine Learning Platform

3 Weeks ago • 10 Years + • Artificial Intelligence • $190,000 PA - $920,000 PA

Job Summary

Job Description

Lead the development of Netflix's next-generation offline inference platform. This role involves partnering with various teams to understand their needs, architecting and designing the platform, building a roadmap for incremental delivery, and managing a team of engineers. Responsibilities include defining success metrics, driving platform adoption, maintaining existing systems, and hiring and growing a high-performing team. The platform will support large-scale ML models across different domains, including LLMs and computer vision. This requires a strong ML infrastructure background and experience building scalable, robust systems.
Must have:
  • 10+ years software engineering, 3+ years management
  • Experience with high-traffic distributed systems and ML infrastructure
  • Containerization and orchestration expertise
  • Understanding of ML frameworks (PyTorch, SageMaker, etc.)
  • Strong technical acumen and mentorship skills
  • Excellent communication and collaboration skills
  • Ability to develop and execute a technical vision

Job Details

Netflix is one of the world's leading entertainment services, with 283 million paid memberships in over 190 countries enjoying TV series, films and games across a wide variety of genres and languages. Members can play, pause and resume watching as much as they want, anytime, anywhere, and can change their plans at any time.

Machine Learning powers innovation in all areas of the business, including helping members choose the right title for them through personalization, better understanding our audience and our content slate, creating high-quality subtitles, dubbings, images, trailers, and other assets, optimizing our payment processing, and much more. The Machine Learning Platform (MLP) organization builds highly scalable and differentiated ML infrastructure to maximize the business impact of all ML practitioners at Netflix, which is the key to accelerating this innovation.

The Opportunity

The Offline Inference team builds and maintains the infrastructure that enables ML practitioners to run their large-scale offline batch inference workflows to generate and store model predictions. We specialize in handling user inferencing submissions that rely on pre-specified static data inputs and machine learning models, packaging them into prediction jobs that can take anywhere from minutes to multiple days to complete, and storing and serving the results.

We are looking for an experienced ML/AI infrastructure engineering leader to lead the development of our next-generation offline inference platform! You will lead this newly formed team to architect, design, develop, test, and launch a brand-new platform to enable ML practitioners across the content, studio, consumer, ads, and games domains to effortlessly package, deploy, and execute inference workflows for thousands of large-scale models, including Large Language Models (LLMs), computer vision and foundation models. The models will come from various lifecycle stages, including early research and experimentation, development, productization, and ongoing innovation and optimization of productized models. 

We are a highly collaborative team. You will partner cross-functionally with other engineering, product management, machine learning, and data teams to take Netflix’s ML/AI initiatives to the next level. To succeed in this role, you will need a strong ML infrastructure background and a passion for building scalable, robust systems that enable and accelerate our ability to apply large and complex ML models across various domains.

In this role, you will:

  • Partner with Applied Research and backend application teams to understand their needs and gather and iterate on the platform's requirements. 

  • Drive the architecture, design, build vs. buy evaluation, and execution of the offline inference platform. Strive for extensibility and ensure the platform can scale to meet the needs of the evolving ML/AI landscape.

  • Build a roadmap focused on incremental delivery. Define success metrics, align on migration goals, and drive the platform's adoption. 

  • Communicate progress to stakeholders, customers, and senior leadership. 

  • Maintain existing platform offerings, balancing immediate needs in current systems while prioritizing the development of the next-gen platform. 

  • Hire and grow diverse, highly talented engineers while maintaining and fostering an inclusive team culture.

To succeed in this role, you will need:

  • 10+ years of software engineering experience and 3+ years of management experience. 

  • Experience leading teams responsible for building high-traffic distributed systems and ML infrastructure.

  • Experience with containerization and orchestration technologies to support data preparation, processing, and inference for large-scale ML models.

  • A proven understanding of ML frameworks and commercial ML/AI infrastructure, such as PyTorch, SageMaker, Ray Serve, and HuggingFace.

  • Strong technical acumen and the ability to act as a credible technical advisor to the team, set and enforce a high quality bar for code and system design, and mentor the team.

  • A passion for translating the needs of ML practitioners into infrastructure offerings with an eye toward automated and self-serve capabilities.

  • Strong communication and collaboration skills and the ability to build strong relationships with internal customers and external partners.

  • A demonstrated ability to develop, drive, and execute a technical vision and roadmap.

  • A track record of attracting top talent and of leading and growing high-performing, diverse teams of highly talented, tenured engineers who are deep into their careers, maximizing their impact and delivering results in a fast-paced and dynamic environment.

  • Experience managing a hybrid team with partners and team members distributed across (US) geographies & time zones.

To learn more about our ML Platform, you can review the relevant talks/blog posts on the .

At Netflix, we carefully consider various compensation factors to determine your personal top of market. We rely on market indicators to determine compensation and consider your specific job, skills, and experience to get it right. These considerations can cause your compensation to vary and will also depend on your location. 

The overall market range for roles in this area of Netflix is typically $190,000 - $920,000. 

This market range is based on total compensation (vs. only base salary), which is in line with our compensation philosophy. Netflix has a unique culture and environment. Learn more

is a Netflix value and we strive to host a meaningful interview experience for all candidates. If you want an accommodation/adjustment for a disability or any other reason during the hiring process, please send a request to your recruiting partner.

We are an equal-opportunity employer and celebrate diversity, recognizing that diversity builds stronger teams. We approach diversity and inclusion seriously and thoughtfully. We do not discriminate on the basis of race, religion, color, ancestry, national origin, caste, sex, sexual orientation, gender, gender identity or expression, age, disability, medical condition, pregnancy, genetic makeup, marital status, or military service.

Job is open for no less than 7 days and will be removed when the position is filled.
