Staff Software Engineer - Infrastructure Reliability

2 Months ago • 7 Years + • DevOps

Job Summary

Job Description

As a Staff Software Engineer on the Infrastructure Reliability team at Riot Games, you'll ensure the scalability, availability, and performance of game infrastructure. This role requires strong coding skills, a passion for automation, and a focus on reliability engineering. Responsibilities include automating infrastructure management, building CI/CD pipelines, architecting cloud-based (AWS, GCP) and hybrid infrastructure solutions, designing and implementing infrastructure systems, mentoring junior engineers, and providing thought leadership. You'll work with Kubernetes, Docker, Terraform, and other tools to build robust and maintainable systems.
Must have:
  • 7+ years software engineering experience
  • AWS expertise (Lambda, API Gateway, EKS, S3)
  • Automation & scripting (Python, Golang)
  • IaC (Terraform, CloudFormation)
  • CI/CD (Jenkins, Harness, GitHub Actions)
  • Containerization (Docker, Kubernetes)
  • Monitoring & Logging tools (Prometheus, Grafana)
  • Team leadership & mentorship
Good to have:
  • GCP knowledge
  • Java knowledge
  • Pulumi experience
  • CDN, WAF, AWS firewall experience
  • Database (SQL, NoSQL) knowledge
  • Networking foundations
Perks:
  • Open paid time off policy
  • Flexible work schedules
  • Medical, dental, and life insurance
  • Parental leave
  • 401k with company match

Job Details

Riot Engineers bring deep knowledge of specific technical areas but also value the chance to work in many broader domains. As a Software Engineer, you’ll also dive into projects that focus on team cohesiveness and cross-team objectives. You’ll lead without authority and provide other engineers with a clear illustration of extraordinary engineering.

As a Staff Software Engineer on the Infrastructure Reliability team, you will be a critical part of our efforts to ensure the scalability, availability, and performance of our game infrastructure. This role demands strong coding skills, a passion for automation, and a focus on reliability engineering to deliver robust and maintainable systems. You will work on implementing infrastructure as code, developing self-healing systems, and creating tools to enhance observability and streamline troubleshooting for core infrastructure services. This role typically combines technical expertise with leadership responsibilities and requires a strong understanding of distributed systems, DevOps practices, and software development.

Responsibilities: 

  • Automation & DevOps: Drive the automation of infrastructure management, deployment pipelines, and system monitoring. Build and maintain CI/CD pipelines to ensure efficient delivery of code to production.
  • Cloud & On-Premises Management: Architect, implement, and manage cloud-based (AWS, GCP) and hybrid infrastructure solutions. Oversee container orchestration using Kubernetes, Docker, or similar technologies.
  • Infrastructure Design & Development: Design and implement infrastructure systems to support large-scale, high-availability services. Develop and maintain tools for infrastructure provisioning, monitoring, and management.
  • Technical Leadership: Mentor and guide junior engineers, fostering a culture of collaboration and technical excellence. Provide thought leadership on infrastructure trends and technologies to influence the organization’s roadmap.

Required Qualifications:

  • 7+ years of experience in software engineering supporting large-scale infrastructure
  • Public Cloud Expertise: Deep knowledge of the AWS ecosystem, including serverless services (e.g., Lambda, API Gateway), container orchestration with Kubernetes (EKS), and foundational services (e.g., S3, VPC, EBS, firewalls). Knowledge of GCP is a plus.
  • Automation and Scripting: Proficiency in scripting and programming languages like Python and Golang to drive automation, manage deployments, and create tooling. Knowledge of Java is a plus.
  • Infrastructure as Code (IaC): Extensive hands-on experience with Terraform, CloudFormation, or similar infrastructure provisioning and configuration management tools. Knowledge of Pulumi for defining cloud infrastructure in code is a plus.
  • CI/CD Expertise: Proven experience working with CI/CD pipelines, including tools like Jenkins, Harness, and GitHub Actions, emphasizing deployment reliability and automation.
  • Containerization: Expertise in container management and orchestration with Docker and Kubernetes, and experience designing robust microservices infrastructure. Strong understanding of configuration formats such as JSON and YAML and their application in IaC, Kubernetes manifests, and other deployment files.
  • Tools and Systems: Familiarity with monitoring and logging tools (e.g., Prometheus, Grafana, ELK Stack) and a deep understanding of distributed systems, networking, and storage solutions.
  • Adaptability: Ability to quickly adopt and adapt to new technologies, frameworks, and cloud-native tools to solve complex problems.
  • Team Leadership: Proven experience in guiding delivery goals across teams, advocating for best practices, and driving alignment on cross-team projects and initiatives.

Desired Qualifications: 

  • Proven experience leading and mentoring a team of engineers, fostering collaboration and technical growth.
  • Good understanding of CDN, WAF and AWS firewalls.
  • Familiarity with databases (SQL and NoSQL) and networking foundations.

For this role, you'll find success through craft expertise, a collaborative spirit, and decision-making that prioritizes your fellow Rioters, who are the customers of your work. Being a dedicated fan of games is not necessary for this position!

Our Perks:

Riot focuses on work/life balance, shown by our open paid time off policy and other perks such as flexible work schedules. We offer medical, dental, and life insurance, parental leave for you, your spouse/domestic partner, and children, and a 401k with company match. Check out our for more information.

Riot Games fosters a player and workplace experience that values teamwork embodied by the and . Our culture embraces differences as a strength, and our values are the guiding principles for how we approach work. We are committed to putting diversity and inclusion (D&I) at the center of everything we do, and promoting a fair and collaborative culture where Rioters treat one another with dignity and respect. We encourage you to read more about our value of and our ongoing work to build the .

It’s our policy to provide equal employment opportunity for all applicants and members of Riot Games, Inc. Riot Games makes reasonable accommodations for handicapped and disabled Rioters and does not unlawfully discriminate on the basis of race, color, religion, sex, sexual orientation, gender identity or expression, national origin, age, handicap, veteran status, marital status, criminal history, or any other category protected by applicable federal and state law. We consider for employment all qualified applicants, including those with criminal histories, in a manner consistent with applicable federal, state and local law, including the California Fair Chance Act, the City of Los Angeles Fair Chance Initiative for Hiring Ordinance, the Los Angeles County Fair Chance Ordinance for Employers, the San Francisco Fair Chance Ordinance, and the Washington Fair Chance Act.

Per the Los Angeles County Fair Chance Ordinance, the following core duties may create a basis for disqualifying candidates with relevant criminal histories:

  • Safeguarding confidential and sensitive Company data
  • Communication with others, including Rioters and third parties such as vendors, and/or players, including minors
  • Accessing Company assets, secure digital systems, and networks
  • Ensuring a safe interactive environment for players and other Rioters

These duties are directly related to essential operations, safety, trust, and compliance obligations within our organization. Please note that job duties may evolve based on business needs and additional responsibilities may be assigned as necessary to maintain operational efficiency and security. 

About The Company

Riot Games is a video game developer, publisher, and esports tournament organizer best known for League of Legends.
