Site Reliability Engineer - Remote

2 Months ago • All levels

About the job

Summary by Outscal

Seeking a Site Reliability Engineer with proven experience building and maintaining scalable infrastructure, expertise in Kubernetes and at least one major cloud provider (GCP/AWS/Azure), and knowledge of incident management.
Must have:
  • Site Reliability Engineer
  • Kubernetes Experience
  • Cloud Provider Expertise
  • Incident Management
Good to have:
  • Go Language
  • GitOps Practices
  • MongoDB/Redis/MySQL
  • Open Source Contributions
Perks:
  • Remote Work
  • Opportunity to Drive Change and Impact

Description

At EFG (ESL FACEIT Group) we create worlds beyond gameplay where players and fans become community. We pride ourselves on our corporate social responsibility commitment: “IT’S NOT GG (Good Game) UNTIL IT’S GG FOR ALL”. We are passionate about the culture we foster, which ultimately helps to create and shape the world of esports, gaming tournaments, leagues, events and holistic ecosystems staged for our millions of players, fans and heroes.

The Team:

As a Site Reliability Engineer at EFG, you will design, analyze, and troubleshoot large-scale distributed systems. You will demonstrate a systematic problem-solving approach and the ability to debug and optimize code and to automate routine tasks. You will ensure that EFG’s services and systems are reliable, have uptime appropriate to users' needs, and improve at a fast rate.

Apart from monitoring our systems' capacity and performance, you will also focus on optimizing existing systems, building infrastructure and eliminating manual work through automation. You will work collaboratively with the software engineering teams to deploy and operate our systems, and you will help to automate and streamline our operations and processes. Within this role, you will be given real responsibilities, and you will have the opportunity to drive change and have a big impact on our products and platform.

What you will do:

  • Maintaining and improving the monitoring and observability tools (Grafana/Prometheus/Thanos/Jaeger);
  • Working closely with your team and with other cross-functional teams to help design, maintain and operate systems at scale;
  • Developing and driving adoption of SRE best practices across the company;
  • Leading the incident management process and its adoption;
  • Using your troubleshooting skills to help identify and fix operational issues;
  • Working with Cloud Native technologies such as Kubernetes, Envoy, Istio, Prometheus and Helm;
  • Working with the “Hashi Stack” (Terraform, Packer, Vault);
  • Experimenting with and introducing cutting-edge technologies.

Requirements

  • Proven experience as a Site Reliability Engineer, DevXP Engineer or Software Engineer, focusing on building and maintaining scalable infrastructures;
  • Excellent working knowledge of at least one of the major cloud providers (GCP/AWS/Azure);
  • You have experience with cluster management systems (Kubernetes);
  • Knowledge of incident management: ability to investigate, troubleshoot, recover and prevent the recurrence of incidents that interfere with the normal delivery of IT services;
  • Proficiency in Go and some level of proficiency in at least one other language: Java, Python, Rust…;
  • You have knowledge of GitOps practices;
  • You have production-scale experience with one of the following: MongoDB, Redis, MySQL;
  • Experience contributing to open source technologies would be an added bonus.

About The Company

The ESL FACEIT Group is the leading competitive games and esports company.


We help brands embrace the youth culture phenomenon of our times. Through exposure and authentic activation, we enable brands to engage the notoriously hard-to-reach global youth audiences capable of driving their growth today and tomorrow.


At EFG we create worlds beyond gameplay where players and fans become community. The company is built on the great legacy of the world-renowned ESL, FACEIT and DreamHack brands.


We harness this legacy to further innovate and develop the esports and gaming landscape worldwide. Working with our developer, publisher, brand, and media partners, we deliver products that accelerate gaming culture and make gamer communities come together.


Join us on the journey as we help gaming communities thrive by creating worlds beyond gameplay that unite players, fans, and creators around the esports and games they love.

