Software Engineer L4/L5, Training Platform, Machine Learning Platform

7 Months ago • All levels • Artificial Intelligence • $100,000 PA - $720,000 PA

Job Summary

Job Description

Netflix seeks a Software Engineer to build and operate a large-scale ML training platform. Candidates must have experience in ML engineering on production systems, in building and operating large-scale infrastructure for ML use cases, and in cloud computing. Expertise in deep learning models and distributed training is preferred.
Must have:
  • ML engineering
  • Production systems
  • Large-scale infrastructure
  • Cloud computing
Good to have:
  • SageMaker
  • Distributed training
  • Generative AI
  • Foundation models
Perks:
  • Stock options
  • Flexible time off

Job Details

Netflix is one of the world’s leading entertainment services with 278 million paid memberships in over 190 countries enjoying TV series, films and games across a wide variety of genres and languages. Members can play, pause and resume watching as much as they want, anytime, anywhere, and can change their plans at any time.

The Role

Machine Learning/Artificial Intelligence powers innovation in all areas of the business, from helping members choose the right title for them through personalization, to better understanding our audience and our content slate, to optimizing our payment processing and other revenue-focused initiatives. Building highly scalable and differentiated ML infrastructure is key to accelerating this innovation.

The Opportunity

We are looking for a driven Software Engineer to join the Training Platform team under our Machine Learning Platform (MLP) org. MLP’s charter is to maximize the business impact of all ML use cases at Netflix through highly reliable and flexible ML tooling and infrastructure that supports key product functions such as personalized recommendations, studio algorithms, virtual productions, growth intelligence, and content demand modeling among others.

In this role you will get to: 

  • Design and build the platform that powers large-scale machine learning model training, fine-tuning, model transformation, and evaluation workflows for use cases across the entire company

  • Co-design and optimize the systems and models to scale up and increase the cost-effectiveness of machine learning model training

  • Design easy-to-use APIs and interfaces that let experienced ML practitioners, as well as non-experts, easily access the training platform

Minimum Job Qualifications

  • Experience in ML engineering on production systems dealing with training or inference of deep learning models.

  • Proven track record of building and operating large-scale infrastructure for machine learning use cases

  • Experience with cloud computing providers, preferably AWS

  • Comfortable with ambiguity and working across multiple layers of the tech stack to execute on both 0-to-1 and 1-to-100 projects

  • Ability to adopt and promote best practices in operations, including observability, logging, reporting, and on-call processes, to ensure engineering excellence

  • Excellent written and verbal communication skills

  • Comfortable working in a team with peers and partners distributed across (US) geographies and time zones

Preferred Qualifications

  • Understanding of modern, real-world machine learning model development workflows, and experience partnering closely with ML modeling engineers

  • Familiarity with cloud-based AI/ML services (e.g., SageMaker, Bedrock, Databricks, OpenAI)

  • Experience with large-scale distributed training and different parallelism techniques for scaling up training, such as FSDP and tensor/pipeline parallelism

  • Expertise in Generative AI, specifically in training foundation models, fine-tuning them, and distilling them into smaller models

What do we offer?

Netflix's culture is an integral part of our success, and we approach diversity and inclusion seriously and thoughtfully. We are an equal opportunity employer and celebrate diversity, recognizing that bringing together different perspectives and backgrounds helps build stronger teams. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.

Our compensation structure consists solely of an annual salary; we do not have bonuses. You choose each year how much of your compensation you want in salary versus stock options. To determine your personal top-of-market compensation, we rely on market indicators and consider your specific job family, background, skills, and experience to place your compensation within the market range. The range for this role is $100,000 - $720,000.

Netflix provides comprehensive benefits including Health Plans, Mental Health support, a 401(k) Retirement Plan with employer match, Stock Option Program, Disability Programs, Health Savings and Flexible Spending Accounts, Family-forming benefits, and Life and Serious Injury Benefits. We also offer paid leave of absence programs. Full-time hourly employees accrue 35 days annually for paid time off to be used for vacation, holidays, and sick paid time off. Full-time salaried employees are immediately entitled to flexible time off. See more details about our Benefits.

Netflix has a unique culture and environment. Learn more.


 

We are an equal-opportunity employer and celebrate diversity, recognizing that diversity of thought and background builds stronger teams. We approach diversity and inclusion seriously and thoughtfully. We do not discriminate on the basis of race, religion, color, ancestry, national origin, caste, sex, sexual orientation, gender, gender identity or expression, age, disability, medical condition, pregnancy, genetic makeup, marital status, or military service.

Job is open for no less than 7 days and will be removed when the position is filled.

