Senior Site Reliability Engineer

2 Months ago • 5 Years + • Devops

Job Summary

Job Description

Reddit is seeking a Senior Site Reliability Engineer to join its Infrastructure SRE team. The role involves improving the reliability and performance of Reddit's engineering platforms and services by leveraging knowledge of distributed systems and architecture. Responsibilities include advising engineering teams on system design, amplifying the capabilities of infrastructure and platform services, automating repetitive tasks, diagnosing and fixing system issues, and optimizing performance and cost. The engineer will also own risk management, ensuring system resilience and implementing best practices. This position offers an opportunity to impact one of the internet's largest sources of information.

Must have:

5+ years of experience in SRE or DevOps
Proficiency in Go or Python
Experience with Kubernetes and Cloud systems
Knowledge of distributed systems
Experience debugging and optimizing code
Troubleshooting skills (applications, networking, systems)
Strong Linux and container knowledge
Excellent communication and collaboration skills

Good to have:

Familiarity with Prometheus, Thanos, Grafana, Vector, Clickhouse, Otel, Loki
Experience with high-traffic backend systems

Perks:

Pension Savings plan
Medical Plan
Short term sickness benefits
WIA excess and WGA gap insurance
Workspace benefits for your home office
Personal & Professional development funds
Family Planning Support
Flexible Vacation & Reddit Global Days Off

13 skills required

cross-functional

communication

problem-solving

risk-management

game-texts

quality-control

networking

linux

incident-response

prometheus

grafana

kubernetes

python

Job Details

Reddit is a community of communities. It’s built on shared interests, passion, and trust and is home to the most open and authentic conversations on the internet. Every day, Reddit users submit, vote, and comment on the topics they care most about. With 100,000+ active communities and approximately 101M+ daily active unique visitors, Reddit is one of the internet’s largest sources of information. For more information, visit redditinc.com.

Reddit SRE is rapidly innovating and our teams are working to meet the needs of infrastructure and development teams as they evolve our product faster than ever before. This is a unique opportunity to leave your mark on one of the most influential and trafficked corners of the internet.

As a Senior Site Reliability Engineer on Reddit’s Infrastructure SRE team, you’ll use your knowledge of distributed systems and architecture to improve the reliability and performance of Reddit’s engineering platforms and services. We are looking for someone who thrives at the intersection of infrastructure and software development. The team works very closely with the Compute, Traffic, and Observability infrastructure teams, and owns a suite of tools that help engineers understand the systems they build, based primarily on open-source solutions at scale. We’re active users of and contributors to Prometheus, Thanos, Grafana, Vector and more.

In this role, you will also take ownership of risk management, ensuring the reliability and performance of our systems. You will collaborate with cross-functional teams to identify, assess, and mitigate risks, implementing best practices to enhance system resilience. Your expertise will drive proactive measures to maintain uptime and optimize service delivery, making a significant impact on our operational excellence.

Join us and help build the future of Reddit!

Responsibilities:

Advise:

Work closely with engineering teams to design and develop systems that are resilient and highly performant at tremendous scale, and to maintain the foundational platform for running Reddit’s infrastructure.

Amplify:

Identify and build capabilities into our foundational Infrastructure and Platform services, which are used by Reddit engineering teams to build, deploy, and operate Reddit.
Deliver software to improve the availability, scalability, latency, and efficiency of observability components.
Identify and engineer away risk across Reddit’s systems.

Automate:

Take repetitive, manual, or risky tasks and automate them out of existence. Build tools and integrate systems to support Reddit’s evolution.
Automate critical aspects of the event-driven development process.

Diagnose:

Draw on your knowledge of distributed systems to identify and fix network, system, and service-level issues. Practice sustainable incident response, and drive structural improvement with blameless postmortems.
Share on-call responsibilities.

Optimize:

Observe and improve performance, reduce cost, and improve the experience for millions of users.
Contribute upstream changes to the open source projects we use.

Qualifications

5+ years of experience in Software Engineering, Site Reliability Engineering, or a development-focused DevOps role.
Proficiency in one or more programming languages. We’re predominantly writing code in Go and Python.
Experience with Kubernetes and Cloud systems.
Familiarity with distributed systems development; bonus if you are familiar with any of these specific tools: Prometheus, Thanos, Grafana, Vector, Clickhouse, Otel, Loki.
Experience with the development and operation of high-traffic backend systems.
A demonstrated ability to debug, fix, and optimize code.
Troubleshooting skills that span applications, networking (TCP/IP), and systems.
Strong working knowledge of Linux and containers.
Excellent communication and collaboration skills.

Benefits:

Pension Savings plan
Medical Plan
Short term sickness benefits
WIA excess and WGA gap insurance
Workspace benefits for your home office
Personal & Professional development funds
Family Planning Support
Flexible Vacation & Reddit Global Days Off

Reddit is proud to be an equal opportunity employer, and is committed to building a workforce representative of the diverse communities we serve. Reddit is committed to providing reasonable accommodations for qualified individuals with disabilities and disabled veterans in our job application procedures. If, due to a disability, you need an accommodation during the interview process, please let your recruiter know.
