Production Engineer

2 Months ago • All levels • Product Management

Job Summary

Job Description

The Core Backend group provides low-latency, high-scale services for all product teams. The Platform Team ensures these core systems are healthy and extends their capabilities, enabling internal business units to ship features rapidly and safely. The mission is to make uptime, reliability, and team velocity natural outcomes of great engineering, not late-night efforts. Responsibilities include designing and owning platform services/APIs, managing platform reliability and accessibility, supporting Elixir/Cassandra systems, enhancing CI/CD pipelines, ensuring end-to-end observability, and automating site-reliability workflows.
Must have:
  • Distributed-systems engineering in Elixir/Erlang (or Go/Python) with reliability patterns
  • Cloud-native infrastructure & automation (Kubernetes on GCP, Buildkite CI/CD, IaC)
  • Observability & SRE tooling for proactive detection and remediation
  • Cross-team collaboration & communication skills
  • Developer-experience mindset with empathetic API/SDK design and clear documentation
Perks:
  • Competitive salary and stock options
  • Comprehensive health, dental, and vision insurance
  • 401(k)
  • Flexible working hours and remote-first culture
  • Clear paths for career growth and leadership

Job Details

Remote-first | Core Platform & Reliability


Why this role exists

Our Core Backend group provides the low-latency, high-scale services that every product team builds on. The Platform Team is the bridge: we keep those core systems healthy and extend their capabilities so that internal business units can ship features rapidly and safely. Your mission is to make uptime, reliability, and team velocity the natural by-products of great engineering—not late-night heroics.

Stack-Ranked Responsibilities (1 = Most Important)

  1. Design & own platform services/APIs that expose core backend functionality to product teams and unlock new business flows through Retool.

  2. Own platform reliability / accessibility and support core engineering on existing Elixir / Cassandra systems: be on rotation and first to respond to production issues. Partner with the core engineering team to build resilient systems for the longer term.

  3. Support and enhance CI/CD pipelines (Buildkite, IaC) so the Core team — and every business unit — can ship safely and quickly.

  4. Ensure end-to-end observability by partnering with core backend and applications on metrics, traces, and alerts, and adding instrumentation where gaps exist.

  5. Automate site-reliability workflows (issue triage, cluster upgrades, schema migrations) while collaborating with each team on their specific operational processes.

Stack-Ranked Required Skills (3 technical | 2 cross-functional)

  1. Distributed-systems engineering in Elixir/Erlang (or Go/Python) with a focus on reliability patterns (idempotency, graceful degradation).

  2. Cloud-native infrastructure & automation — Kubernetes on GCP, Buildkite CI/CD, Terraform or similar IaC, and scripting to eliminate manual toil.

  3. Observability & SRE tooling — designing metrics, logs, and traces that drive proactive detection and rapid remediation.

  4. Cross-team collaboration & communication — able to partner with Core Backend and multiple business units, translating reliability needs into actionable engineering work.

  5. Developer-experience mindset — empathetic API/SDK design and clear documentation that accelerates other engineers’ adoption of platform capabilities.

Benefits

  • Competitive salary and stock options

  • Comprehensive health, dental, and vision insurance

  • 401(k)

  • Flexible working hours and remote-first culture

  • Clear paths for career growth and leadership

How we work

  • Remote-first, async-heavy. Deep work valued; meetings kept minimal.

  • Light follow-the-sun escalation — because automation, testing, and observability catch issues early.

  • Blameless culture. We learn fast and systematize fixes.

Ready to build the platform that powers everything — and make reliability boring? Apply now and help us keep the core humming while unlocking new possibilities for every product team.


About The Company

Get notified when new jobs are added by Sleeper
