Senior Observability Engineer

23 Minutes ago • 4-8 Years • Backend Development

About the job

Job Description

The Senior Observability Engineer at Epic Games will be responsible for building and operating the infrastructure used by teams to maintain platforms, games, and online services. This involves working across the company to implement best practices, develop new monitoring capabilities, and process large volumes of telemetry data. Responsibilities include service ownership, development and deployment of new data processing pipelines, automation of processes, collaborating with teams as an observability expert, and modernizing observability infrastructure. The role requires experience with large-scale systems in AWS (Kubernetes, Terraform), familiarity with monitoring technologies (OpenTelemetry, Prometheus, Grafana, etc.), and a capacity to work effectively in a fast-paced environment.
Must have:
  • Experience with large-scale AWS systems (Kubernetes)
  • Proficiency in Terraform
  • Familiarity with monitoring technologies (OpenTelemetry, Prometheus, Grafana)
  • Ability to work in a fast-paced environment
  • Service ownership mentality
Perks:
  • 100% premium coverage for medical, dental, vision
  • Long-term disability, life insurance
  • 401k with competitive match
  • Robust mental well-being program
  • Unlimited PTO and sick time
  • Paid sabbatical after 7 years

WHAT MAKES US EPIC?

At the core of Epic’s success are talented, passionate people. Epic prides itself on creating a collaborative, welcoming, and creative environment. Whether it’s building award-winning games or crafting engine technology that enables others to make visually stunning interactive experiences, we’re always innovating.

Being Epic means being a part of a team that continually strives to do right by our community and users. We’re constantly innovating to raise the bar of engine and game development.

ONLINE INFRASTRUCTURE

What We Do

We enable Epic’s online services teams to build, deploy, and manage services that are used by more than half a billion players around the world. Our mission is to provide world-class tools and platforms to improve the experience of our developers and make it easier, faster, and safer to build, operate, and scale their applications. We operate at massive scale as one of the largest cloud computing users in the world.

What You'll Do

Our Observability team is looking for a Senior SRE to help us build and operate the infrastructure our teams rely on to keep our platforms, games, and online services running. Our Observability team works across all of Epic to implement industry best practices and develop new monitoring capabilities. As an SRE on Observability, you will tackle problems that impact how we understand and operate our products at scale. This team is responsible for company-wide metrics, logging, exception handling, and dashboarding solutions. In this role, you will build and operate the systems that process and transport the large volumes of telemetry data generated by services at Epic.

In this role, you will

  • Service Ownership - At Epic we embrace a Service Owner (You build it, you run it) mentality. In this role, you will work together with other members of the Observability team to operate the infrastructure our developers depend on to operate their own services.
  • Develop and Ship - You will work to modernize key portions of our observability infrastructure, building new data processing pipelines for telemetry data and writing software to automate processes and generate new insights.
  • Collaborate - You will work with teams across Epic as an observability subject matter expert to provide guidance on observability best practices.

What we're looking for

  • Experience executing meaningful change in a fast-paced, interrupt-driven environment.
  • A self-starter who approaches challenges creatively and methodically, seeing them through to final resolution.
  • Ability to adapt and be effective in new situations within a highly dynamic environment.
  • Experience working with large-scale systems in AWS, mostly deployed via Kubernetes.
  • Comfort in a very Terraform-heavy environment, both reviewing PRs and contributing your own.
  • Familiarity with application/service monitoring strategies and technologies, such as OpenTelemetry, Prometheus, Grafana, FluentD, New Relic, Datadog, Sentry, and Sumo Logic.

EPIC JOB + EPIC BENEFITS = EPIC LIFE

Our intent is to cover all things that are medically necessary and improve the quality of life. We pay 100% of the premiums for both you and your dependents. Our coverage includes Medical, Dental, a Vision HRA, Long Term Disability, Life Insurance & a 401k with competitive match. We also offer a robust mental well-being program through Modern Health, which provides free therapy and coaching for employees & dependents. Throughout the year we celebrate our employees with events and company-wide paid breaks. We offer unlimited PTO and sick time and recognize individuals for 7 years of employment with a paid sabbatical.

ABOUT US

Epic Games spans 25 countries with 46 studios and 4,500+ employees globally. For over 25 years, we've been making award-winning games and engine technology that empowers others to make visually stunning games and 3D content that bring environments to life like never before. Epic's award-winning Unreal Engine technology not only provides game developers the ability to build high-fidelity, interactive experiences for PC, console, mobile, and VR, it is also a tool being embraced by content creators across a variety of industries such as media and entertainment, automotive, and architectural design. As we continue to build our Engine technology and develop remarkable games, we strive to build teams of world-class talent.

Like what you hear? Come be a part of something Epic!

Epic Games deeply values diverse teams and an inclusive work culture, and we are proud to be an Equal Opportunity employer. Learn more about our Equal Employment Opportunity (EEO) Policy.

Note to Recruitment Agencies: Epic does not accept any unsolicited resumes or approaches from any unauthorized third party (including recruitment or placement agencies) (i.e., a third party with whom we do not have a negotiated and validly executed agreement). We will not pay any fees to any unauthorized third party.

About The Company

Founded in 1991, Epic Games is a leading interactive entertainment company and provider of 3D engine technology. Epic operates Fortnite, one of the world’s largest games with over 350 million accounts and 2.5 billion friend connections. Epic also develops Unreal Engine, which powers the world’s leading games and is adopted across industries such as film and television, architecture, automotive, manufacturing, and simulation. Through Unreal Engine, Epic Games Store, and Epic Online Services, Epic provides an end-to-end digital ecosystem for developers and creators to build, distribute, and operate games and other content. Epic has over 40 offices worldwide with headquarters in Cary, North Carolina.
