Manager, Incident Response and Service Reliability

2 Minutes ago • 8 Years + • $197,800 PA - $297,300 PA

Job Summary

Job Description

This role leads the Incident Management Team for Apple Wallet, focusing on building and leading the incident response program for one of Apple's most impactful and customer-facing services. It's a hands-on, high-accountability role requiring technical fluency, operational rigor, and strong leadership. The manager will define and operate processes for detecting, triaging, prioritizing, and mitigating service-impacting incidents, drive proactive identification of recurring issues, and lead root cause analysis to improve reliability. Close collaboration with engineering, infrastructure, SRE, and product teams is key to handling incidents with urgency and communicating clearly.
Must have:
  • Define and own strategic vision for incident and problem management, integrating tooling, response structure, and continuous improvement.
  • Lead end-to-end incident response program, including severity classification, escalation protocols, stakeholder communication, and real-time coordination.
  • Own problem management function by identifying systemic issues, driving root cause analysis, and partnering with engineering for long-term fixes.
  • Manage a team of incident and problem managers, setting priorities, execution standards, and development goals.
  • Define and track operational health metrics (e.g., MTTD, MTTM, MTTR), driving improvements in detection, mitigation, and recovery.
  • Oversee adoption and evolution of incident tooling (e.g., monitoring, alerting, automation, documentation, and reporting).
  • Facilitate blameless post-incident reviews (PIRs) resulting in clear accountability, cross-functional alignment, and durable outcomes.
  • Instill a culture of operational learning and resilience, driving systemic and architectural improvements to reduce incident volume and customer impact.
Good to have:
  • Experience working in payments, banking, or other financial services companies in a developer role (SRE, DevOps or other engineering experience).
  • Experience leading incident programs across global teams or regulated environments.
  • Background in high-availability systems, payments infrastructure, or customer-critical services.
  • Familiarity with root cause analysis frameworks, postmortem facilitation, and chaos testing.
  • Experience integrating incident workflows with observability and BI platforms (e.g., Datadog, Grafana, Tableau).
  • Experience driving change in cross-functional or matrixed organizations.
Perks:
  • Comprehensive medical and dental coverage
  • Retirement benefits
  • Discounted products and free services
  • Reimbursement for certain educational expenses (including tuition)
  • Discretionary bonuses or commission payments (might be eligible)
  • Relocation (might be eligible)
  • Opportunity to become an Apple shareholder through discretionary employee stock programs
  • Ability to purchase Apple stock at a discount via Employee Stock Purchase Plan

Job Details

Are you passionate about operational excellence and protecting the customer experience? Are you drawn to solving some of the most complex and cross-functional challenges in an organization? Do you thrive on driving strategic changes that prevent problems before they happen? If so, you might be the right person to lead our Incident Management Team. This role focuses on building and leading the incident response program for Apple Wallet, one of our most impactful and customer-facing services. It’s a hands-on, high-accountability role that requires technical fluency, operational rigor, and strong leadership. At Apple, we don’t just build products; we craft the kind of wonder that’s revolutionized entire industries. Apple Wallet has changed the way we access the world, and is one of our fastest growing and most impactful services. If this excites you, apply to join our talented team.

The Product Operations team empowers Apple teams to execute at scale. We tackle complex organizational, technical, and operational challenges to ensure seamless execution and strategic alignment across Apple Wallet. As the manager for the Incident Response and Service Reliability Team, you will lead the team responsible for Apple Wallet’s real-time incident response program. You will define and operate the processes for detecting, triaging, prioritizing, and mitigating service-impacting incidents. You will drive the proactive identification of recurring issues, lead root cause analysis, and partner with engineering to implement long-term fixes that reduce risk and improve reliability. Through close collaboration with engineering, infrastructure, SRE, and product teams, you will ensure that incidents are handled with urgency, communication is clear, and issues are addressed at the root.

Key Qualifications

  • Bachelor’s degree or equivalent practical experience.
  • 8+ years of experience in incident management, technical program management, or SRE/infra leadership roles.
  • Demonstrated experience building or scaling an incident management program in a production or customer-facing environment.
  • Proven ability to define, measure, and influence operational metrics (MTTD, MTTR, etc.).
  • Strong cross-functional collaboration skills, particularly with engineering, product, and executive stakeholders.
  • Excellent communication skills under pressure, with the ability to drive clarity and urgency.
  • Experience with incident tooling (e.g., PagerDuty, Opsgenie, Slack bots, observability platforms).

Description

  • Define and own the strategic vision for incident and problem management, integrating tooling, response structure, and continuous improvement across engineering.
  • Lead the end-to-end incident response program, including severity classification, escalation protocols, stakeholder communication, and real-time coordination.
  • Own the problem management function by identifying systemic issues, driving root cause analysis, and partnering with engineering to implement long-term fixes.
  • Manage a team of incident and problem managers, setting priorities, execution standards, and development goals.
  • Define and track operational health metrics (e.g., MTTD, MTTM, MTTR), and drive improvements in detection, mitigation, and recovery timelines.
  • Oversee the adoption and evolution of incident tooling (e.g., monitoring, alerting, automation, documentation, and reporting).
  • Facilitate blameless post-incident reviews (PIRs) that result in clear accountability, cross-functional alignment, and durable outcomes.
  • Instill a culture of operational learning and resilience, driving systemic and architectural improvements that reduce incident volume and minimize customer impact.

Preferred Qualifications

  • Experience working in payments, banking, or other financial services companies in a developer role (SRE, DevOps or other engineering experience).
  • Experience leading incident programs across global teams or regulated environments.
  • Background in high-availability systems, payments infrastructure, or customer-critical services.
  • Familiarity with root cause analysis frameworks, postmortem facilitation, and chaos testing.
  • Experience integrating incident workflows with observability and BI platforms (e.g., Datadog, Grafana, Tableau).
  • Experience driving change in cross-functional or matrixed organizations.

At Apple, base pay is one part of our total compensation package and is determined within a range. This provides the opportunity to progress as you grow and develop within a role. The base pay range for this role is between $197,800 and $297,300, and your base pay will depend on your skills, qualifications, experience, and location.

Apple employees also have the opportunity to become Apple shareholders through participation in Apple’s discretionary employee stock programs. Apple employees are eligible for discretionary restricted stock unit awards, and can purchase Apple stock at a discount if voluntarily participating in Apple’s Employee Stock Purchase Plan. You’ll also receive benefits including comprehensive medical and dental coverage, retirement benefits, a range of discounted products and free services, and, for formal education related to advancing your career at Apple, reimbursement for certain educational expenses, including tuition. Additionally, this role might be eligible for discretionary bonuses or commission payments as well as relocation.

