Site Reliability Engineer (SRE)

Drivemode

Tokyo, Japan (Hybrid)

1 day ago • 3+ years • DevOps

Job Summary

Job Description

Drivemode is seeking an experienced Site Reliability Engineer (SRE) to manage the reliability, performance, and daily operations of its Kotlin/Swift mobile applications and Kotlin backend services on AWS. You will collaborate with product and platform engineers to establish SLIs/SLOs, automate operations, lead incident response, and promote a "code-driven reliability" culture. The role sits within a production-support model in which SREs and feature teams share Level 2/3 support, with SREs providing the tools, coaching, and leadership developers need to excel in on-call responsibilities. You will have green-field influence in defining SRE culture, tooling, and error-budget policies, with a clear career path to Staff SRE or Reliability Lead as the company scales.

Must have:

3+ years in SRE, DevOps, or backend engineering
Proficient in Kotlin/Java, Rust, Go, or Python
Linux & networking fundamentals
Hands-on AWS experience
Production experience with Datadog or similar
Incident response expertise
Relational DB and Redis operations knowledge
Excellent communication skills

Good to have:

AWS, CKA certifications
Feature-flag systems experience
Chaos-engineering tools experience
Automotive or fintech industry experience

Perks:

Competitive salary
Flexible remote policy
Allowance for certifications, conferences, and home lab gear

17 skills required

communication

budget-management

postgresql

networking

linux

incident-response

aws

rust

prometheus

helm

redis

kubernetes

kotlin

python

firebase

swift

java

Job Details

Our Mission:

Driving technology always feels old. Not by a little bit. We believe vehicles can be a thousand times smarter, safer, and more connected to the world around us, and our mission is to see it happen. In 2019, we joined forces with Honda as their first startup acquisition, and now we’re expanding our vision into building the future of electric vehicles (BEV) for millions of people around the world.

Why Drivemode:

Join Drivemode for an exciting startup environment and a vibrant culture that combines impactful work, competitive compensation, and excellent benefits. By becoming a part of our team, you'll contribute to a crucial mission that revolutionizes the way people engage with vehicles, addressing both business needs and the world's environmental challenges. This presents an exceptional opportunity to be at the forefront of innovation and drive Honda's success in the EV market.

About the Role:

We’re seeking an experienced Site Reliability Engineer to own the reliability, performance, and day-to-day operations of our Kotlin/Swift mobile applications and Kotlin backend services on AWS. You will partner with product engineers and platform engineers to design SLIs/SLOs, automate operations, lead incident response, and drive a “code-driven reliability” culture across time zones.

You will be part of a production-support model in which Level 2 and Level 3 support are shared by SREs and feature teams. SREs provide the tooling, coaching, and leadership that make developers excellent on call.

Why Join?

Green-field influence: define SRE culture, tooling, and error-budget policy from day one.

Career trajectory: opportunity to grow into Staff SRE / Reliability Lead as we scale to multiple regions and product lines.

Impact at scale: your work spans multiple regions and product lines worldwide.

Engineering-driven org: close collaboration with product, platform, and security teams who value operational excellence.

Competitive salary, flexible remote policy, and an allowance for certifications, conferences, and home lab gear.

What You Will Do:

Service Reliability: Define and track SLIs/SLOs & error budgets for backend APIs and mobile release health. Hold teams accountable to reliability goals.
Incident Management: Lead on-call rotations, coordinate incident response, run post-mortems, and eradicate root causes.
Observability & Tooling: Own Datadog dashboards, log pipelines, crash analytics (Firebase / Sentry), and feature-flag metrics (LaunchDarkly / ConfigCat).
Automation & Elimination of Toil: Write tools and self-healing runbooks in Kotlin, Rust, Go, or Python for rollbacks, DB failovers, chaos tests, and config drift detection.
Capacity & Performance: Forecast load, run stress / load tests, tune JVM & Graal settings for Kotlin services, and advise on RDS & Redis scaling.
Disaster Recovery & Chaos Engineering: Design BCP/DR playbooks; run game days to validate recovery objectives.
Cost & FinOps: Instrument cost metrics and collaborate with Finance to keep AWS spend within agreed “cost budgets.”
Security & Compliance Support: Monitor GuardDuty / CSPM alerts; prepare for and participate in security incident response.
Developer Partnership: Pair with mobile & backend engineers on instrumentation, release gates, and staged roll-outs; mentor teams in SLO thinking via brown-bag sessions.

What We Are Looking For:

3+ years in SRE, DevOps, or backend engineering for high-traffic services
Proficient in at least one of Kotlin / Java, Rust, Go, or Python
Deep Linux & networking fundamentals and hands-on AWS (ECS, ALB/NLB, RDS, S3, IAM, CloudWatch)
Production experience with Datadog (or Prometheus / OpenTelemetry) for metrics, traces, and logs
Incident response expertise: runbooks, RCA, post-mortems, and blameless culture
Practical knowledge of relational DB (PostgreSQL/RDS) and Redis operations
Familiarity with Kubernetes (EKS) concepts, Helm/OPA, container networking, and rolling releases
Excellent communication skills; able to coach developers and influence process improvements.

Nice to have:

AWS, CKA certifications
Experience with feature-flag systems, chaos-engineering tools
Prior work in regulated or enterprise-integrated environments (e.g., automotive, fintech)

EEOC Statement: Drivemode is proud of a very diverse team, with employees coming from 5 continents and 20 countries as of today. Diversity in our workplace has played an important part in our success; we recognize each employee’s unique background, knowledge, experiences, ideas, and viewpoints, all of which are critical in developing a product that has the greatest impact on drivers all over the world. Drivemode provides equal opportunities to all employees and applicants for employment without regard to race, religion, color, age, gender, national origin, sexual orientation, gender identity, disability, or any other characteristics that make you unique.


