Applied Machine Learning Backend Engineer (L5), Monitoring & Alerting

1 Month ago • 6 Years + • Backend Development • Research & Development • $100,000 PA - $720,000 PA

Job Summary

Job Description

The Applied Machine Learning Backend Engineer (L5) will be part of Netflix's Monitoring and Alerting team, focusing on enhancing application resilience using AI Ops. Responsibilities include defining and implementing strategies for AI Ops platform integration with ML algorithms and Generative AI; building systems integrating observability data with LLMs for improved issue diagnosis; contributing to a platform processing billions of data points; and mentoring junior engineers. The role requires expertise in backend service development, ML at scale, distributed computing frameworks (Spark, Kafka), Java/Python, and API design and maintenance. A product mindset and strong communication skills are essential.
Must have:
  • 6+ years experience in high-scale companies
  • Applied ML engineer with backend experience
  • Proficient in Java/Python and distributed computing
  • API design, building, and maintenance expertise
  • Strong communication and collaboration skills
Good to have:
  • Experience with observability products (logs, metrics, traces)
  • Familiarity with ML algorithms for fault detection (anomaly detection, time series analysis)
  • Experience with Generative AI
Perks:
  • Comprehensive benefits including health plans, mental health support, 401(k), stock options
  • Paid time off, flexible time off (for salaried employees)
  • Family-forming benefits

Job Details

Netflix is one of the world's leading entertainment services, with 283 million paid memberships in over 190 countries enjoying TV series, films and games across a wide variety of genres and languages. Members can play, pause and resume watching as much as they want, anytime, anywhere, and can change their plans at any time.

The Observability Team provides a platform and suite of products that enable Netflix engineers to monitor service behavior in real time, detect system health anomalies, and troubleshoot issues. This role is part of the Monitoring and Alerting team, a crucial part of our Observability engineering group.
 

As part of the team, you will: 

  • Work across a suite of observability products, including System Health Experience, Alert configuration and management tools, and Issue Management Experience.

  • Define and implement a forward-looking strategy for Netflix's AI Ops platform, focusing on the integration of Machine Learning algorithms and cutting-edge Generative AI techniques to enhance application resilience in conjunction with observability platforms. 

  • Build systems that integrate internal Observability data with LLMs for more effective guidance when diagnosing issues.

  • Contribute to a platform that processes billions of data points in real time every minute.

  • Provide mentorship to early career engineers on the team, fostering their growth and development. 

About You:

  • 6+ years of experience in high-scale, modern tech-driven companies.

  • You consider yourself an Applied Machine Learning engineer and have spent a good part of your career both developing backend services and applying Machine Learning algorithms at scale. You act like an owner and have expertise in building and maintaining backend services.

  • You are proficient in designing, building, and operating algorithms or leveraging APIs for Monitoring and Alerting applications. You are familiar with the latest techniques and applications of Generative AI.

  • You are proficient in distributed computing frameworks that enable efficient handling, analysis, and processing of large datasets, such as Apache Spark for fast, in-memory data processing, and Apache Kafka for real-time data streaming.

  • You have built, maintained, evolved, and/or sunset a variety of APIs. You can confidently describe each API’s schema, purpose, response types, and up/down-stream services.

  • You are proficient in working with Java (or JVM language) and Python.

  • You are knowledgeable about, and willing to own, all areas of the software lifecycle: design, development, test, deploy, operate, and support.

  • You possess exceptional communication and collaboration skills, effectively engaging with a diverse, cross-functional team of stunning colleagues. You are capable of working independently, yet excel in a dynamic team environment.

  • You have a product mindset that is deeply empathetic to customer needs, strategic in orientation, and metrics- and outcomes-driven.

Nice To Have

  • You have used or built observability products involving logs, metrics, and traces. 

  • You are familiar with ML algorithms for detecting and classifying system faults, such as real-time anomaly detection, outlier detection, and time series analysis and forecasting.

About Netflix

Our culture is unique, and we live by our values, so it’s worth learning more about Netflix.

Our compensation structure consists solely of an annual salary; we do not have bonuses. You choose each year how much of your compensation you want in salary versus stock options. To determine your personal top-of-market compensation, we rely on market indicators and consider your specific job family, background, skills, and experience to place you within the market range. The range for this role is $100,000 - $720,000.

Netflix provides comprehensive benefits including Health Plans, Mental Health support, a 401(k) Retirement Plan with employer match, Stock Option Program, Disability Programs, Health Savings and Flexible Spending Accounts, Family-forming benefits, and Life and Serious Injury Benefits. We also offer paid leave of absence programs. Full-time hourly employees accrue 35 days annually for paid time off to be used for vacation, holidays, and sick paid time off. Full-time salaried employees are immediately entitled to flexible time off.

Inclusion is a Netflix value and we strive to host a meaningful interview experience for all candidates. If you want an accommodation/adjustment for a disability or any other reason during the hiring process, please send a request to your recruiting partner.

We are an equal-opportunity employer and celebrate diversity, recognizing that diversity builds stronger teams. We approach diversity and inclusion seriously and thoughtfully. We do not discriminate on the basis of race, religion, color, ancestry, national origin, caste, sex, sexual orientation, gender, gender identity or expression, age, disability, medical condition, pregnancy, genetic makeup, marital status, or military service.

This job is open for no less than 7 days and will be removed when the position is filled.





