Senior Site Reliability Engineer - GCP Focussed

1 Month ago • 5 Years + • DevOps • $116,100 PA - $198,440 PA

Job Summary

Job Description

This Senior Site Reliability Engineer (SRE) role focuses on GCP, managing large-scale, data-intensive systems. Responsibilities include administering cloud infrastructure (GCS, Cloud SQL, Spanner, Firestore), supporting ML/analytics platforms (Vertex AI, BigQuery, Dataproc), implementing cloud observability (OpenTelemetry), troubleshooting Linux systems, managing GCP services (GKE, GCE), building IaC (Terraform, Ansible), and deploying services (Python, Golang, Java). The ideal candidate will have 5+ years of experience in SRE, DevOps, or Infrastructure Engineering, with a strong GCP background and experience with CI/CD pipelines. The role requires strong communication and problem-solving skills and involves on-call support.

Must have:
  • 5+ years SRE experience
  • GCP expertise
  • Cloud infrastructure management
  • ML/analytics platform support
  • IaC experience (Terraform, Ansible)
  • CI/CD pipeline implementation
  • Linux system administration
  • On-call support

Job Details

About the Role

We are seeking a highly skilled and experienced Senior Site Reliability Engineer (SRE) to join our dynamic team. The ideal candidate will have a strong background in managing large-scale, data-intensive production-grade systems and infrastructure, with deep experience in cloud observability, automation, and reliability engineering at scale. A solid understanding of public cloud services—especially Google Cloud Platform (GCP)—is essential.

At the core of this role is the administration and maintenance of cloud infrastructure, including on-call support, monitoring, automation, deployment, the establishment of CI/CD pipelines, and the formulation of reusable cloud infrastructure templates via infrastructure as code (IaC) methodologies.

You will apply these SRE principles to design and implement scalable, automated infrastructure supporting ML model training, real-time inference APIs, and analytics workloads across platforms like Vertex AI, BigQuery, and Dataproc. You’ll work closely with ML and data teams to ensure production systems are observable, performant, and fault-tolerant — embedding reliability into every stage of the pipeline.

This role involves working in a remote environment, requiring excellent communication skills and the ability to solve complex problems independently and creatively.

Work Location: US-Remote, Canada-Remote

Key Responsibilities:

    • Administer and optimize cloud-native databases and storage platforms, including Google Cloud Storage (GCS), Cloud SQL, Spanner, and Firestore.
    • Support and maintain machine learning and analytics platforms, including Vertex AI, Generative AI, BigQuery, Looker, and Dataproc, ensuring scalable and reliable infrastructure for data pipelines and model workflows.
    • Implement and manage cloud observability using OpenTelemetry and native GCP tools to enable real-time monitoring, distributed tracing, and incident resolution.
    • Support and maintain large-scale applications, computer systems, and networks in production environments.
    • Administer and troubleshoot Linux-based systems and core networking protocols such as TCP/IP, HTTP, DNS, and mail protocols, and manage components such as content delivery networks (CDNs) and load balancers.
    • Manage and operate GCP services, including Kubernetes Engine (GKE), Compute Engine (GCE), networking, security, CI/CD pipelines, and other common cloud technologies.
    • Build and maintain cloud infrastructure using Infrastructure as Code (IaC) tools such as Terraform, Ansible, and Helm Charts.
    • Develop and deploy services using Python, Golang, or Java, and implement CI/CD pipelines to ensure consistent, reliable delivery of applications and infrastructure components.

Qualifications:

    • Bachelor’s or Master’s degree in Computer Science, Engineering, or a related technical field.
    • 5+ years of experience in Site Reliability Engineering, DevOps, or Infrastructure Engineering, including hands-on operational support and participation in on-call rotations.
    • Proven track record of managing large-scale applications, distributed systems, and networked services in production.

The following information is required by pay transparency legislation in the following states: CA, CO, HI, NY, and WA. This information applies only to individuals working in these states.

    • The anticipated starting pay range for Colorado is: $116,100 - $170,280.
    • The anticipated starting pay range for the states of Hawaii and New York (not including NYC) is: $123,600 - $181,280.
    • The anticipated starting pay range for California, New York City and Washington is: $135,300 - $198,440.

Unless already included in the posted pay range and based on eligibility, the role may include variable compensation in the form of bonus, commissions, or other discretionary payments. These discretionary payments are based on company and/or individual performance and may change at any time. Actual compensation is influenced by a wide array of factors including but not limited to skill set, level of experience, licenses and certifications, and specific work location. Information on benefits offered is here.

About Rackspace Technology
We are the multicloud solutions experts. We combine our expertise with the world’s leading technologies — across applications, data and security — to deliver end-to-end solutions. We have a proven record of advising customers based on their business challenges, designing solutions that scale, building and managing those solutions, and optimizing returns into the future. Named a best place to work, year after year according to Fortune, Forbes and Glassdoor, we attract and develop world-class talent. Join us on our mission to embrace technology, empower customers and deliver the future.
 
 
More on Rackspace Technology
Though we’re all different, Rackers thrive through our connection to a central goal: to be a valued member of a winning team on an inspiring mission. We bring our whole selves to work every day. And we embrace the notion that unique perspectives fuel innovation and enable us to best serve our customers and communities around the globe. We welcome you to apply today and want you to know that we are committed to offering equal employment opportunity without regard to age, color, disability, gender reassignment or identity or expression, genetic information, marital or civil partner status, pregnancy or maternity status, military or veteran status, nationality, ethnic or national origin, race, religion or belief, sexual orientation, or any legally protected characteristic. If you have a disability or special need that requires accommodation, please let us know.
 
 
