Senior Site Reliability Engineer

2 Months ago • 7 Years + • Devops • $98,400 PA - $145,620 PA

Job Summary

Job Description

2K Games is seeking a Senior Site Reliability Engineer (SRE) with extensive knowledge of Unix/Linux systems, distributed infrastructure, and automation tools to enhance and maintain critical platforms serving millions of users globally. The role involves building resilient, high-performance services for live gaming environments, balancing system stability, scalability, and operational efficiency. Responsibilities include managing complex technology stacks across AWS, GCP, and on-premise environments, building auto-scaling systems, optimizing OS internals, integrating authentication, leading high-availability architecture design, implementing disaster recovery, and defining observability standards using tools like Datadog and Grafana. The SRE will also participate in on-call rotations, handle critical incidents, lead post-mortems, and design preventative solutions. Automation is a key aspect of the role, involving infrastructure-as-code with Terraform, Puppet, and Ansible, as well as scripting with Python and shell. Collaborating with development teams to embed reliability, mentoring other engineers, and sharing expertise in debugging and system architecture are also crucial. The position requires a strong understanding of Unix/Linux systems, performance tuning, distributed applications in cloud environments, and infrastructure automation.
Must have:
  • 7+ years in SRE/Infrastructure/Systems Engineering
  • Deep Unix/Linux systems expertise
  • Kernel tuning and performance profiling
  • 6+ years in AWS/GCP with distributed apps
  • Advanced Python and Shell scripting
  • Proficiency with Terraform, Ansible, Puppet
  • Hybrid infrastructure experience (VMware, containers, Kubernetes)
  • Hands-on monitoring and observability experience
Good to have:
  • Experience supporting live game services
  • Open-source contributions
  • Familiarity with telemetry pipelines (ETL, Flink, Kafka, Kinesis)
  • Kubernetes-native tooling and service meshes
  • Operational knowledge of MySQL/Postgres
Perks:
  • Bonus and/or equity awards
  • 401(K) plan
  • Employee Stock Purchase Program
  • Medical, dental, vision insurance
  • Basic life insurance
  • 14 paid holidays
  • Paid vacation time
  • Paid sick days
  • Paid parental and compassionate leave
  • Wellbeing programs
  • Family planning support
  • Commuter benefits
  • Fitness-related expense reimbursements

Job Details

 

On-Call Requirement: Yes (Periodic Rotation)

Who We Are

 2K is headquartered in Novato, California and is a wholly owned label of Take-Two Interactive Software, Inc. (NASDAQ: TTWO). Founded in 2005, 2K Games is a global video game company, publishing titles developed by some of the most influential game development studios in the world. The studios responsible for developing 2K’s portfolio of world-class games across multiple platforms include Visual Concepts, Firaxis, Hangar 13, CatDaddy, Cloud Chamber, 31st Union, HB Studios, and 2K SportsLab. Our portfolio of titles is expanding under our global strategic plan of building and acquiring exciting studios whose content continues to inspire all of us! 2K publishes titles in today’s most popular gaming genres, including sports, shooters, action, role-playing, strategy, casual, and family entertainment.

 Our team of engineers, marketers, artists, writers, data scientists, producers, thinkers, and doers are the professional publishing stewards of 2K’s portfolio, which currently includes several AAA, sports, and entertainment brands, including global powerhouse NBA®️ 2K; the renowned BioShock®️, Borderlands®️, Mafia, Sid Meier’s Civilization®️, and XCOM®️ brands; the popular WWE®️ 2K and WWE®️ SuperCard franchises; TopSpin 2K25; and the critically and commercially acclaimed PGA TOUR®️ 2K.

 At 2K, we pride ourselves on creating an inclusive work environment, which means encouraging our teams to Come as You Are and do their best work! We encourage ALL applicants to explore our global positions, even if they don’t meet every requirement for the role. If you’re interested in the job and think you have what it takes to work at 2K, we encourage you to apply!

 

What We Need

 We are seeking a Senior Site Reliability Engineer (SRE) with deep expertise in Unix/Linux systems architecture, distributed infrastructure, and automation tooling to help scale and sustain mission-critical platforms that serve millions of active users worldwide. You’ll play a leading role in building resilient, high-performance services for live gaming environments—balancing system stability, scalability, and operational velocity.


  As part of our SRE team, you’ll work across a complex technology stack spanning AWS, GCP, and hybrid on-prem environments. You’ll be responsible for building auto-scaling, self-healing Unix-based systems, optimizing OS internals, and integrating authentication across enterprise identity systems. You’ll lead the design of high-availability architecture, implement disaster recovery, apply advanced performance tuning across kernel, network, and filesystem layers, and define/enforce observability standards using Datadog, Grafana, and open-source telemetry tools. Your efforts will power real-time insights, automated alerting, and rapid incident detection and resolution. As a senior member of the on-call rotation, you’ll handle critical outages, lead post-mortems, and design long-term preventative solutions.


 Automation is foundational to this role. You’ll build and maintain infrastructure-as-code (IaC) with tools like Terraform, Puppet, and Ansible, orchestrating deployments, configurations, and updates across heterogeneous environments. You’ll extend platform APIs and backend tooling using Python and shell scripts, driving continuous improvement in platform delivery.


 Collaboration is key: You’ll partner with backend and gameplay engineers to embed reliability into every layer of the tech stack. You’ll contribute to shared reliability standards, CI/CD integration pipelines, provisioning templates, and internal documentation. As a mentor, you’ll share your expertise in debugging, system architecture, and tooling best practices, empowering engineers across disciplines to build complex, resilient systems.

 

What You’ll Do

Systems Design, Scaling & Resilience

  • Design and operate distributed Unix-based systems (Red Hat, Ubuntu, Debian, CentOS).
  • Implement auto-scaling and self-healing infrastructure to ensure uptime and durability.
  • Tune system internals including kernel parameters, networking, and filesystems for high performance.
  • Maintain timely OS patching and compliance posture across environments.
  • Integrate systems with enterprise identity services such as Active Directory, LDAP, and Kerberos.

Automation & Infrastructure as Code

  • Build and maintain infrastructure automation using Terraform, Puppet, and Ansible.
  • Automate deployment pipelines, service configurations, and patch management.
  • Develop scripts and services in Python and Bash/shell to enhance infrastructure delivery workflows.
  • Extend APIs and platform automation to drive efficiency and repeatability.

Observability, Monitoring & Incident Response

  • Develop observability stacks using Datadog, Prometheus, Grafana, and open-source telemetry tools.
  • Create dashboards and SLO/SLI-based alerts for real-time monitoring of production systems.
  • Participate in a global 24/7 on-call rotation, leading response for high-severity incidents.
  • Conduct post-incident analysis (RCA) and drive remediations that improve long-term reliability.

Multi-Cloud & Hybrid Platform Engineering

  • Manage workloads across AWS, GCP, and on-prem infrastructure.
  • Design and implement multi-region failover, load balancing, and disaster recovery strategies.
  • Work with both VM-based and containerized/Kubernetes platforms including vSphere/VMware.
  • Support backup, restore, and DR tooling with strict availability targets.

Collaboration, Standards & Enablement

  • Partner with development teams to embed reliability in deployment pipelines.
  • Help define system architecture standards and maintain robust platform documentation.
  • Mentor engineers in Unix performance, observability, and debugging practices.
  • Champion a culture of automation, resilience, and continuous improvement.

What Will Make You A Great Fit

  • 7+ years in SRE, Infrastructure, or Systems Engineering roles managing production services.
  • Deep expertise with Unix/Linux systems including Red Hat, Debian, Ubuntu, and CentOS.
  • Experience in kernel tuning, performance profiling, and debugging complex system issues.
  • 6+ years working in AWS and/or GCP with large-scale, distributed applications.
  • Advanced skills in Python, Shell scripting, and optionally Go or Ruby.
  • Strong grasp of IaC tools like Terraform, Ansible, and Puppet.
  • Experience running hybrid infrastructure (cloud/on-prem) with VMware, containers, and Kubernetes.
  • Hands-on experience with monitoring, telemetry, and observability stacks.

Additional qualities

  • Experience supporting live game services or other high-throughput, low-latency platforms.
  • Contributions to open-source tooling in observability, automation, or infrastructure domains.
  • Familiarity with telemetry and ETL pipelines built on tools like Flink, Kafka, or Kinesis.
  • Experience with Kubernetes-native tooling and service meshes (e.g., Istio, Linkerd).
  • Operational knowledge of MySQL/Postgres in cloud-native and bare-metal deployments.

 You thrive in collaborative environments that value technical skill and operational excellence. Your passion for high-quality infrastructure empowers development teams and enhances productivity.

 As an equal opportunity employer, we are committed to ensuring that qualified individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform their essential job functions, and to receive other benefits and privileges of employment. Please contact us if you need reasonable accommodation.

 Please note that 2K Games and its studios never use instant messaging apps or personal email accounts to contact prospective employees or conduct interviews, and when emailing, use only 2K.com accounts.


The pay range for this position in California at the start of employment is expected to be between $98,400 and $145,620 per year. However, base pay offered is based on market location and may vary further depending on individualized factors for job candidates, such as job-related knowledge, skills, experience, and other objective business considerations. Subject to those same considerations, the total compensation package for this position may also include other elements, including a bonus and/or equity awards and eligibility to participate in our 401(K) plan and Employee Stock Purchase Program. Regular, full-time employees are also eligible for a range of benefits at the Company, including: medical, dental, vision, and basic life insurance coverage; 14 paid holidays per calendar year; paid vacation time per calendar year (ranging from 15 to 25 days) or eligibility to participate in the Company’s discretionary time off program; up to 10 paid sick days per calendar year; paid parental and compassionate leave; wellbeing programs for mental health and other wellness support; family planning support through Maven; commuter benefits; and reimbursements for fitness-related expenses.
