DevOps Engineer, Site Reliability Engineering (SRE)

2 Months ago • 7+ Years • DevOps

Job Summary

Job Description

LMArena is seeking an experienced, security-minded Site Reliability Engineer (SRE) to manage and enhance its infrastructure, processes, and operational security. The role involves end-to-end ownership of infrastructure across Cloudflare, Vercel, and CI/CD pipelines, embedding security best practices, and establishing efficient onboarding and mentorship processes. This position is ideal for a seasoned SRE who excels at the intersection of reliability, performance, and security, supporting fast-moving product teams.
Must have:
  • 7+ years in SRE/DevOps for SaaS/consumer web products
  • Expertise securing/scaling Cloudflare/Vercel
  • Deep understanding of web app security, networking, TLS, zero-trust
  • Strong IaC (Terraform/Pulumi) & CI/CD (GitHub Actions) skills
  • Proficiency in Golang, Python, TypeScript, scripting
  • Experience with change management workflows
  • Excellent written communication for runbooks/docs
  • Track record of mentoring junior engineers
Good to have:
  • Experience with Kubernetes/Nomad
  • Experience with serverless stacks
  • Relevant certifications (AWS/GCP Pro, GIAC, CKS, CISSP)
Perks:
  • Impact on a growing AI benchmarking platform
  • Engineering-first, documentation-driven, community-obsessed culture
  • Competitive salary, equity, comprehensive benefits
  • Professional development budget

Job Details

About the Company

LMArena is an engineering-first startup redefining how the world evaluates large language models. Created in 2023 by UC Berkeley researchers, our neutral, community-driven benchmarking platform attracts over one million monthly users—pairwise comparing leading models from OpenAI, Google, Anthropic, and more—to deliver real-time insights into the rapidly evolving LLM landscape. LMArena is scaling fast to build the next generation of AI testing infrastructure and set the industry standard for model evaluation.

Position Overview

We are seeking an experienced, security‑minded Site Reliability Engineer to own and elevate our infrastructure, processes, and operational security. We are cognizant that much of the security domain sits within the SRE’s world these days and are building our team accordingly. You will:

  • Take end‑to‑end ownership of infrastructure operations across Cloudflare, Vercel, and our CI/CD pipelines.

  • Embed security best practices into every layer of the stack, ensuring resilience against emerging threats.

  • Establish processes and procedures for efficiently onboarding and ramping up new team members, and mentor incoming and more junior members of the team.

This role is ideal for a seasoned SRE who thrives at the intersection of reliability, performance, and security, and who brings the rigor needed to keep fast‑moving product teams focused on innovation.

Key Responsibilities

  • Infrastructure as Code – Manage Terraform modules and secrets pipelines; champion immutable, auditable infrastructure.

  • Cloudflare Operations – Configure, monitor, and harden WAF, DDoS protections, bot management, and CDN caching strategies.

  • Vercel & Edge Runtime – Own deployment architecture, performance tuning, and incident response for our Next.js‑based front end and Edge Functions.

  • CI/CD & Release Engineering – Design, implement, and maintain secure pipelines (GitHub Actions, Vercel integrations) with automated testing and vulnerability scanning.

  • Change Management & Documentation – Establish and enforce a lightweight but disciplined RFC/change‑control process; maintain comprehensive runbooks and architecture diagrams.

  • Observability & Incident Response – Expand monitoring, logging, and alerting; lead post‑incident reviews and drive continual improvement.

  • Mentorship – Provide day‑to‑day guidance to engineers and junior SREs, fostering a culture of ownership and learning.

  • Compliance Support – Partner with ProdSec and GRC teams on SOC 2, ISO 27001, and customer security questionnaires.

  • Infrastructure Management – Manage and maintain internal- and external-facing infrastructure.

  • Log Aggregation – Maintain log aggregation requirements and configure the infrastructure used to store logs across the business.

Required Qualifications

  • 7+ years in SRE/DevOps roles for high‑traffic SaaS or consumer web products.

  • Proven expertise securing and scaling Cloudflare and Vercel (or comparable CDN/edge and serverless platforms).

  • Deep understanding of web application security, networking, TLS, and zero‑trust principles.

  • Strong proficiency with infrastructure as code (Terraform, Pulumi, or similar) and serverless build pipelines (GitHub Actions or similar).

  • Strong programming abilities (Golang, Python, TypeScript) and scripting skills.

  • Demonstrated success designing and enforcing change‑management workflows.

  • Excellent written communication—able to produce clear runbooks and architecture docs.

  • Track record of mentoring or leading junior engineers.

Nice‑to‑Have

  • Experience with container orchestration (Kubernetes or Nomad).

  • Experience with serverless stacks.

  • Certifications such as AWS/GCP Professional, GIAC‑GCSA, CKS, or CISSP.

Why You’ll Love Working Here

  • Impact – You’ll set the foundation for reliability and security across a rapidly growing AI benchmarking platform.

  • Culture – Engineering‑first, documentation‑driven, and community‑obsessed.

  • Compensation – Competitive salary, meaningful equity, comprehensive benefits, and professional‑development budget.
