CDN Site Reliability Engineer (SRE) L4/L5

4 Months ago • 3 Years + • DevOps • $100,000 PA - $720,000 PA

Job Summary

Job Description

Netflix seeks a Site Reliability Engineer (SRE) to design, scale, operate, automate, and analyze its globally distributed CDN. Responsibilities include improving resiliency, observability, and automation; analyzing large datasets to optimize platform performance and reliability; providing technical support to ISP partners; and handling Tier 3 escalations. The ideal candidate possesses strong *nix, networking, data analysis, and large-scale platform operation skills, with experience in CDNs and HTTP cache/proxy technologies. This role involves working collaboratively with internal and external partners to ensure a high-quality viewing experience for Netflix users worldwide.
Must have:
  • 3+ years SRE experience
  • Strong *nix/Linux skills
  • Networking expertise (TCP/IP, BGP, DNS)
  • Data analysis & automation skills (Python)
  • Experience with large-scale systems
Good to have:
  • Experience with distributed analytic processing (Hive, Presto)
  • Container and orchestration technologies (Docker, Kubernetes)
  • Understanding of applied statistics

Job Details

Netflix is one of the world's leading entertainment services, with 283 million paid memberships in over 190 countries enjoying TV series, films and games across a wide variety of genres and languages. Members can play, pause and resume watching as much as they want, anytime, anywhere, and can change their plans at any time.

How do you spark joy in hundreds of millions of people? It starts with a vision - that technology can give voice to stories around the world. In delivering those much-loved stories, Netflix is responsible for a significant portion of global internet traffic.

To steward that responsibility, we work collaboratively with ISPs to deploy Open Connect, Netflix’s Content Delivery Network (CDN), our in-house custom-built network and server infrastructure responsible for delivering 100% of Netflix's video traffic.

We strive to deliver a great Netflix viewing experience in over 190 countries so our customers can watch whatever, whenever, interruption-free. 

We are seeking a Reliability Engineer with extensive experience in *nix, networking, data analysis, and large-scale platform operations to design, scale, operate, automate, and analyze our globally distributed CDN. Come join us and play a meaningful role in our journey to entertain the world!

Responsibilities 

  • Drive continual improvement in resiliency, observability, monitoring, instrumentation, and automation with the primary goal to maintain a highly scalable and reliable CDN platform worldwide.

  • Aggregate, analyze, and correlate large amounts of server and application performance data. Use the innovative Netflix Big Data platform as a highly flexible, specialized and efficient toolset to identify opportunities for platform optimization and system reliability improvements, and to identify patterns/anomalies for further investigation.

  • Provide technical design and engineering assistance to ISP partners to integrate our Open Connect Appliances.

  • Handle Tier 3 escalations and participate in an on-call rotation for CDN platform production issues.

Qualifications

  • 3+ years of Service Reliability/Operational experience running large-scale, high-performance systems and internet services, with a focus on performance and reliability.

  • Preferred - B.S. in Computer Science, Electrical or Computer Engineering (or equivalent professional experience)

  • Strong working knowledge of networking concepts and application protocols, especially TCP/IP, BGP, DNS, TLS, and HTTP/S, with focused experience in CDNs and HTTP cache/proxy technologies.

  • Skilled in designing, creating and maintaining automation written in a programming language such as Python.

  • Expert-level knowledge managing and debugging Unix/Linux systems (engineering fundamentals, networking, storage, operating systems) at scale.

  • Experience with distributed analytic processing technologies (Hive, Presto/Trino, Spark SQL, etc.).

  • Strong understanding of applied statistics and the ability to code systems that identify outlier behavior in large systems.

  • Some experience with container and container orchestration technologies (Docker, Kubernetes).

  • Ability to work in a highly collaborative environment and to communicate cross functionally with internal and external partners.

Things that show how we think

Does this sound interesting? Or does this sound interesting but intimidating? Please don’t self-select; let’s figure it out together. We’d love to talk to you!

Netflix is a global company with a diverse member base, which is why the content we produce reflects global perspectives and global stories. As we grow globally, we must have the most talented employees with diverse backgrounds, cultures, perspectives, and experiences to support our innovation and creativity. We are an equal opportunity employer and strive to build balanced teams from all walks of life.

Our culture is unique, and we tend to live by our values, so it’s worth learning more about Netflix.

At Netflix, we carefully consider a wide range of compensation factors to determine your personal top of market. We rely on market indicators to determine compensation and consider your specific job, skills, and experience to get it right. These considerations can cause your compensation to vary and will also be dependent on your location. The overall market range for roles in this area of Netflix is typically $100,000 - $720,000. This market range is based on total compensation (vs. only base salary), which is in line with our compensation philosophy.

Inclusion is a Netflix value and we strive to host a meaningful interview experience for all candidates. If you want an accommodation/adjustment for a disability or any other reason during the hiring process, please send a request to your recruiting partner.

We are an equal-opportunity employer and celebrate diversity, recognizing that diversity builds stronger teams. We approach diversity and inclusion seriously and thoughtfully. We do not discriminate on the basis of race, religion, color, ancestry, national origin, caste, sex, sexual orientation, gender, gender identity or expression, age, disability, medical condition, pregnancy, genetic makeup, marital status, or military service.

Job is open for no less than 7 days and will be removed when the position is filled.


