Staff Site Reliability Engineer

1 Month ago • 8 Years + • Devops

Job Details

Aerospike is the real-time database for mission-critical use cases and workloads, including machine learning, generative, and agentic AI. Aerospike powers millions of transactions per second with millisecond latency, at a fraction of the total cost of ownership compared to other databases.

Global leaders, including Adobe, Airtel, Barclays, Criteo, DBS Bank, Experian, Grab, HDFC Bank, PayPal, Sony Interactive Entertainment, The Trade Desk, and Wayfair, rely on Aerospike for customer 360, fraud detection, real-time bidding, profile stores, recommendation engines, and other use cases.

At Aerospike, we dream big and deliver even bigger. Our mission is to unleash the power of the world’s real-time data with a database built for infinite scale, speed, and sustainability.

If you're ready to shape the future of data, join us.

As a Staff Site Reliability Engineer at Aerospike, you’ll be a technical leader within our global SRE organization, helping drive reliability, performance, and scalability across our hybrid and multi-cloud environments. You’ll bring deep operational experience and lead by example—mentoring others, designing resilient systems, and championing modern SRE practices across new and legacy platforms.

You’ll play a key role in shaping the direction of our infrastructure initiatives, from Kubernetes-based platforms like AKS and the Aerospike Kubernetes Operator to existing services in AWS and GCP. Your impact will span teams and systems as you solve complex problems, influence architecture, and foster a culture of ownership, resilience, and continuous improvement.

Key Responsibilities

  • Provide technical leadership across multiple systems and environments, proactively identifying risks, shaping architecture decisions, and improving reliability and performance at scale.
  • Lead key infrastructure efforts including Kubernetes platform expansion (AKS, AKO), and application of SRE principles to legacy systems and new cloud offerings.
  • Define, measure, and enforce reliability standards through SLIs/SLOs, observability tooling, and incident response frameworks.
  • Mentor and guide other SREs by leading design sessions, conducting technical deep dives, and reviewing code, configurations, and infrastructure decisions.
  • Partner with product, engineering, and cloud teams to align reliability goals with delivery objectives.
  • Lead root cause analyses and implement systemic fixes for issues spanning multiple platforms or services.
  • Drive automation-first approaches using IaC, CI/CD pipelines, and scripting to reduce toil and increase deployment confidence.
  • Influence cross-functional roadmaps, identifying areas for innovation, technical debt reduction, and long-term scalability.
  • Participate in the global on-call rotation, bringing senior-level calm and clarity during incidents and escalations.

Required Experience

  • 8+ years of experience in SRE, DevOps, or infrastructure engineering, including significant time operating production systems at scale.
  • Deep hands-on experience with at least one major public cloud (AWS, GCP, Azure), and working knowledge of the others; Azure experience is a plus.
  • Production experience with Kubernetes, including operating clusters, Helm, operators, and supporting microservices in real-world environments.
  • Strong proficiency in infrastructure-as-code tools such as Terraform and CI/CD automation platforms.
  • Expertise in observability tools and practices (Datadog, Prometheus, Grafana, ELK, etc.) and using them to define SLIs and SLOs; Datadog experience is a plus.
  • Programming and scripting ability in one or more languages (Python, Go, Bash, etc.).
  • Experience with large-scale incident response and post-incident review practices.
  • Proven ability to mentor other engineers and influence technical strategy across multiple teams.
  • Strong communication skills to articulate complex concepts to technical and non-technical stakeholders.

Preferred Skills and Qualifications

  • Hands-on experience managing and optimizing database deployments and services in production environments, ensuring high availability and performance.
  • Familiarity with Aerospike or other distributed databases is a plus.
  • Kubernetes or cloud certifications (CKA, CKS, AWS/GCP DevOps/Architect) are a plus but not required.
  • Track record of influencing architectural decisions across teams or domains.

Aerospike is an Equal Opportunity Employer. We are committed to providing an environment free from discrimination on the basis of race, religion, color, sex, gender identity, sexual orientation, age, non-disqualifying physical or mental disability, national origin, veteran status, or any other basis covered by appropriate law.

About The Company

Headquartered in Mountain View, California, Aerospike also has a global presence with offices in London, Bangalore, and Tel Aviv. Aerospike does not accept resumes from staffing agencies with which we do not have a written agreement and specific engagement for a particular opening. Our employment activities, inquiries, and offers are managed through our HR/Talent department, and all candidates are presented through this channel only. We do not accept unsolicited resumes.
