Site Reliability Engineer (SRE)

1 Month ago • 3 Years + • Devops

Job Summary

Job Description

Drivemode is seeking an experienced Site Reliability Engineer (SRE) to manage the reliability, performance, and daily operations of its Kotlin/Swift mobile applications and Kotlin backend services on AWS. You will collaborate with product and platform engineers to establish SLIs/SLOs, automate operations, lead incident response, and promote a "code-driven reliability" culture. The role sits within a production-support model in which SREs and feature teams share Level 2/3 support, with SREs providing the tools, coaching, and leadership developers need to excel on call. You will have green-field influence in defining SRE culture, tooling, and error-budget policies, with a clear career path to Staff SRE or Reliability Lead as the company scales.
Must have:
  • 3+ years in SRE, DevOps, or backend engineering
  • Proficient in Kotlin/Java, Rust, Go, or Python
  • Linux & networking fundamentals
  • Hands-on AWS experience
  • Production experience with Datadog or similar
  • Incident response expertise
  • Relational DB and Redis operations knowledge
  • Excellent communication skills
Good to have:
  • AWS, CKA certifications
  • Feature-flag systems experience
  • Chaos-engineering tools experience
  • Automotive or fintech industry experience
Perks:
  • Competitive salary
  • Flexible remote policy
  • Allowance for certifications, conferences, and home lab gear

Job Details

Our Mission:
Driving technology always feels old. Not by a little bit. We believe vehicles can be a thousand times smarter, safer, and more connected to the world around us, and our mission is to see it happen. In 2019, we joined forces with Honda as their first startup acquisition, and now we’re expanding our vision into building the future of electric vehicles (BEV) for millions of people around the world.

Why Drivemode: 
Join Drivemode for an exciting startup environment and a vibrant culture that combines impactful work, competitive compensation, and excellent benefits. By becoming a part of our team, you'll contribute to a crucial mission that revolutionizes the way people engage with vehicles, addressing both business needs and the world's environmental challenges. This presents an exceptional opportunity to be at the forefront of innovation and drive Honda's success in the EV market.

About the Role:
We’re seeking an experienced Site Reliability Engineer to own the reliability, performance, and day-to-day operations of our Kotlin/Swift mobile applications and Kotlin backend services on AWS. You will partner with product engineers and platform engineers to design SLIs/SLOs, automate operations, lead incident response, and drive a “code-driven reliability” culture across time zones.
You will be part of a production-support model in which Level 2 and Level 3 support are shared by SREs and feature teams. SREs provide the tooling, coaching, and leadership that make developers excellent on call.

Why Join?
  • Green-field influence: define SRE culture, tooling, and error-budget policy from day one.
  • Career trajectory: opportunity to grow into Staff SRE / Reliability Lead as we scale to multiple regions and product lines.
  • Impact at scale: your work spans multiple regions and product lines worldwide.
  • Engineering-driven org: close collaboration with product, platform, and security teams who value operational excellence.
  • Competitive salary, flexible remote policy, and an allowance for certifications, conferences, and home lab gear.

What You Will Do:
  • Service Reliability: Define and track SLIs/SLOs & error budgets for backend APIs and mobile release health. Hold teams accountable to reliability goals.
  • Incident Management: Lead the on-call rotations, coordinate incident response, run post-mortems, and eradicate root causes.
  • Observability & Tooling: Own Datadog dashboards, log pipelines, crash analytics (Firebase / Sentry), and feature-flag metrics (LaunchDarkly / ConfigCat).
  • Automation & Elimination of Toil: Write tools and self-healing runbooks in Kotlin, Rust, Go, or Python for rollbacks, DB failovers, chaos tests, and config drift detection.
  • Capacity & Performance: Forecast load, run stress / load tests, tune JVM & Graal settings for Kotlin services, and advise on RDS & Redis scaling.
  • Disaster Recovery & Chaos Engineering: Design BCP/DR playbooks; run game days to validate recovery objectives.
  • Cost & FinOps: Instrument cost metrics and collaborate with Finance to keep AWS spend within agreed “cost budgets.”
  • Security & Compliance Support: Monitor GuardDuty / CSPM alerts; be prepared for, and participate in, security incident response.
  • Developer Partnership: Pair with mobile & backend engineers on instrumentation, release gates, and staged roll-outs; mentor teams in SLO thinking via brown-bag sessions.

What We Are Looking For:
  • 3+ years in SRE, DevOps, or backend engineering for high-traffic services
  • Proficient in at least one of Kotlin / Java, Rust, Go, or Python
  • Deep Linux & networking fundamentals and hands-on AWS (ECS, ALB/NLB, RDS, S3, IAM, CloudWatch)
  • Production experience with Datadog (or Prometheus / OpenTelemetry) for metrics, traces, and logs
  • Incident response expertise: runbooks, RCA, post-mortems, and blameless culture
  • Practical knowledge of relational DB (PostgreSQL/RDS) and Redis operations
  • Familiarity with Kubernetes (EKS) concepts, Helm/OPA, container networking, and rolling releases
  • Excellent communication skills; able to coach developers and influence process improvements.

Nice to have:
  • AWS, CKA certifications
  • Experience with feature-flag systems, chaos-engineering tools
  • Prior work in regulated or enterprise-integrated environments (e.g., automotive, fintech)

EEOC Statement: Drivemode is proud of its very diverse team, with employees from 5 continents and 20 countries as of today. Diversity in our workplace has played an important part in our success; we recognize each employee’s unique background, knowledge, experiences, ideas, and viewpoints, all of which are critical in developing a product that has the greatest impact on drivers all over the world. Drivemode provides equal opportunities to all employees and applicants for employment without regard to race, religion, color, age, gender, national origin, sexual orientation, gender identity, disability, or any other characteristics that make you unique.
