Manager, Site Reliability Engineering Tooling

9 Hours ago • All levels • Devops

Job Summary

Job Description

Toast is seeking a Manager of Site Reliability Engineering Tooling to lead their platform teams. The SRE team is responsible for overseeing Toast production services, focusing on quality, reliability, and low latency. This involves building automation tooling, developing and evangelizing best practices for scalability and observability, consulting with teams to improve systems, and participating in incident response. The manager will provide technical leadership, hands-on code contributions, and mentor a geographically distributed team. Key responsibilities include driving daily operations, developing the SRE roadmap, influencing architecture decisions, and guiding teams to build reliable systems.
Must have:
  • Manage an SRE team
  • Hands-on coding (Kotlin, Go, Python, Java/JVM)
  • Lead complex engineering projects
  • Build and run distributed systems
  • Understand systems, networking, scaling
  • Cloud infrastructure exposure
Good to have:
  • Mentoring engineers
  • Cross-functional collaboration
  • Scrum environment experience
  • Networking knowledge
  • Cloud architectures
  • SaaS solutions exposure
Perks:
  • Competitive compensation
  • Comprehensive benefits
  • Healthy lifestyle support
  • Flexible work arrangements

Job Details

Toast is driven by building the platform that helps restaurants adapt, take control, and focus on what they do best: creating experiences their guests love. Tremendous business growth has spurred a need for significant investment in Toast's platform teams. The Site Reliability Engineering team at Toast is responsible for overseeing Toast production services, with a commitment to quality, reliability, and low latency — without needing heroics. The team accomplishes this goal by:

  • Building tooling to automate, monitor, and manage deployed services using reliability best practices
  • Developing and evangelizing patterns and best practices to improve the scalability, observability, and reliability of all Toast systems
  • Consulting with teams to improve product scalability, observability, security, and reliability
  • Participating in outage response and root cause analysis for critical systems and infrastructure incidents

As a Manager of the Site Reliability Engineering team, you will provide technical leadership and hands-on code contributions, incorporating reliability best practices for programming and scripting, observability, production triage, incident resolution, and retrospective/root cause analysis to maintain the world-class reliability and uptime of our platform. 

About this roll* (Responsibilities) 

  • Enable a geographically distributed team of talented engineers to continue performing at a high level and help increase the impact of their work
  • Drive day-to-day operations of the team and contribute to the development and prioritization of the SRE roadmap for major initiatives
  • Create and drive strategic organization-wide scalability, observability, and reliability initiatives in collaboration with technical leadership and Product Management
  • Influence architecture decisions for your team and for individual services to optimize resilience and scalability
  • Guide teams to build and maintain systems that are reliable and available for Toast customers
  • Facilitate professional growth by mentoring engineers on your team

Do you have the right ingredients*? (Requirements)

  • Hands-on experience managing an SRE team, including hiring, mentoring, and cross-functional collaboration
  • Hands-on coding experience with Kotlin, Go, Python, Java/JVM
  • Background in leading complex engineering projects in a Scrum environment
  • Experience in building and running distributed systems
  • Exposure to networking, cloud architectures, and patterns 
  • Deep understanding of systems, networking, and scaling issues
  • Direct exposure to cloud infrastructure and SaaS solutions

**This is a hybrid role requiring in-office presence two days per week**

Our Spread* of Total Rewards
We strive to provide competitive compensation and benefits programs that help to attract, retain, and motivate the best and brightest people in our industry. Our total rewards package goes beyond great earnings potential and provides the means to a healthy lifestyle with the flexibility to meet Toasters’ changing needs. Learn more about our benefits at https://careers.toasttab.com/toast-benefits.

*Bread puns encouraged but not required

Diversity, Equity, and Inclusion is Baked into our Recipe for Success

At Toast, our employees are our secret ingredient—when they thrive, we thrive. The restaurant industry is one of the most diverse, and we embrace that diversity with authenticity, inclusivity, respect, and humility. By embedding these principles into our culture and design, we create equitable opportunities for all and raise the bar in delivering exceptional experiences.

We Thrive Together

We embrace a hybrid work model that fosters in-person collaboration while valuing individual needs. Our goal is to build a strong culture of connection as we work together to empower the restaurant community. To learn more about how we work globally and regionally, check out: https://careers.toasttab.com/locations-toast.

Apply today!

Toast is committed to creating an accessible and inclusive hiring process. As part of this commitment, we strive to provide reasonable accommodations for persons with disabilities to enable them to access the hiring process. If you need an accommodation to access the job application or interview process, please contact candidateaccommodations@toasttab.com.

------

For roles in the United States, it is unlawful in Massachusetts to require or administer a lie detector test as a condition of employment or continued employment. An employer who violates this law shall be subject to criminal penalties and civil liability.
