Senior Staff Software Engineer, Site Reliability Engineering

19 Hours ago • 8-15 Years • DevOps • Backend Development

Job Summary

Job Description

This Senior Staff Software Engineer, Site Reliability Engineering (SRE) role at Google involves designing, analyzing, and troubleshooting large-scale distributed systems. Responsibilities encompass the entire service lifecycle, from inception and design to deployment, operation, and refinement. SREs ensure high reliability, uptime, and performance of Google Cloud services, both internal and external. The work includes system design consulting, software platform development, capacity planning, monitoring system health, automating processes for scalability, and conducting incident response and postmortems. The ideal candidate possesses strong software development skills (8+ years), project leadership experience (4+ years), and expertise in distributed systems (3+ years).
Must have:
  • 8+ years software development experience
  • 4+ years project leadership
  • 3+ years experience with distributed systems
  • Expertise in system design, analysis, and troubleshooting
  • Automation and optimization skills
Good to have:
  • Experience in computing, distributed systems, storage, or networking
  • Expertise in large-scale distributed systems

Job Details

Minimum qualifications:

  • Bachelor’s degree in Computer Science, a related field, or equivalent practical experience.
  • 8 years of experience with software development in one or more programming languages.
  • 4 years of experience leading projects and providing technical leadership.
  • 3 years of experience in designing, analyzing, and troubleshooting distributed systems.

Preferred qualifications:

  • Experience working in computing, distributed systems, storage, or networking.
  • Expertise in designing, analyzing, and troubleshooting large-scale distributed systems.
  • Ability to debug and optimize code and to automate routine tasks.
  • Systematic problem-solving approach, coupled with effective verbal and written communication skills.

About the job

Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. SRE ensures that Google Cloud's services, both our internally critical and our externally visible systems, have reliability and uptime appropriate to customers' needs and a fast rate of improvement. Additionally, SREs keep an ever-watchful eye on our systems' capacity and performance.


Much of our software development focuses on optimizing existing systems, building infrastructure and eliminating work through automation. On the SRE team, you’ll have the opportunity to manage the complex challenges of scale which are unique to Google Cloud, while using your expertise in coding, algorithms, complexity analysis and large-scale system design. SRE's culture of intellectual curiosity, problem solving and openness is key to its success. Our organization brings together people with a wide variety of backgrounds, experiences and perspectives. We encourage them to collaborate, think big and take risks in a blame-free environment. We promote self-direction to work on meaningful projects, while we also strive to create an environment that provides the support and mentorship needed to learn and grow.

Behind everything our users see online is the architecture built by the Technical Infrastructure team to keep it running. From developing and maintaining our data centers to building the next generation of Google platforms, we make Google's product portfolio possible. We're proud to be our engineers' engineers and love voiding warranties by taking things apart so we can rebuild them. We keep our networks up and running, ensuring our users have the best and fastest experience possible.

Responsibilities

  • Engage in and improve the whole lifecycle of services—from inception and design, through to deployment, operation and refinement.
  • Support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning and launch reviews.
  • Maintain services once they are live by measuring and monitoring availability, latency and overall system health.
  • Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity.
  • Practice sustainable incident response and blameless postmortems.

About The Company

A problem isn't truly solved until it's solved for all. Googlers build products that help create opportunities for everyone, whether down the street or across the globe. Bring your insight, imagination and a healthy disregard for the impossible. Bring everything that makes you unique. Together, we can build for everyone.

Dublin, County Dublin, Ireland (On-Site)

New York, New York, United States (On-Site)

Waterloo, Ontario, Canada (On-Site)

Taipei City, Taiwan (On-Site)

San Francisco, California, United States (On-Site)

Saint-Ghislain, Wallonia, Belgium (On-Site)

Bengaluru, Karnataka, India (On-Site)

Austin, Texas, United States (On-Site)
