Senior Site Reliability Engineering Manager

Microsoft

Washington, United States (Hybrid)

8 Months ago • 6-7 Years • Network Engineering • $117,200 PA - $250,200 PA

Job Summary

Job Description

This Senior Site Reliability Engineering Manager role at Microsoft Azure focuses on managing a large-scale storage service fleet. Responsibilities include optimizing fleet availability, health, and efficiency; leading a team in designing and developing automation and uptime improvements; investigating and solving complex issues at scale; and driving sprint planning and cross-team collaboration. The role requires deep understanding of massive distributed systems, server architecture, and troubleshooting. Incident response and post-mortem reporting are also key components. The position offers significant impact on cost reduction and high-level visibility within the organization.

Must have:

6+ years experience in relevant field
4+ years Agile/SCRUM experience
Lead large cross-team efforts
Develop, test, implement code changes
Investigate and resolve hardware/system issues
Incident response and post-mortem reporting

Good to have:

Understanding of server architecture and components
Familiarity with distributed systems
Experience with management techniques and scope control

Perks:

Industry leading healthcare
Educational resources
Discounts on products and services
Savings and investments
Maternity and paternity leave
Generous time away
Giving programs
Networking opportunities

2 skills required

azure

scalability

Job Details

Overview

Are you passionate about hardware and enabling new technology? Do you enjoy complex problem solving and investigation? Azure has one of the largest storage services on the planet, holding exabytes of data and files not just for our 3rd party customers, but also for many of Microsoft’s own services. This role will focus on managing an ever-growing and changing fleet at scale to maximize efficiency while providing a stable environment for our customers.

As a Senior Site Reliability Engineering Manager on the Azure Storage team, you will work with a team of engineers focused on optimizing fleet availability and health. You will lead that team to design, develop, and improve automation and uptime. You will take the lead on planning, investigating complex issues, and designing solutions to solve problems at scale.

This opportunity will allow you to deepen your knowledge and experience with massive distributed systems. You will have opportunities to make a significant impact on reducing cost to the business, with exposure and visibility at VP and CVP levels. This position is located in Redmond and has a flexible work environment that supports working from home.

Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

Qualifications

Required Qualifications:

6+ years technical experience in software engineering, network engineering, or systems administration
- OR Bachelor's Degree in Computer Science, Information Technology, or related field AND 3+ years technical experience in software engineering, network engineering, or systems administration
- OR Master's Degree in Computer Science, Information Technology, or related field AND 2+ years technical experience in software engineering, network engineering, or systems administration.
4+ years of Agile / SCRUM planning and leading large cross-team efforts.

Other Requirements:

The ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:
- Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter.

Preferred Qualifications:

7+ years technical experience in software engineering, network engineering, or systems administration
- OR Bachelor's Degree in Computer Science, Information Technology, or related field AND 4+ years technical experience in software engineering, network engineering, or systems administration
- OR Master's Degree in Computer Science, Information Technology, or related field AND 3+ years technical experience in software engineering, network engineering, or systems administration.
Understanding of server architecture and the ability to debug and troubleshoot issues impacting the fleet.
Understanding of server components, firmware, and BIOS, and how they interact.
Understanding of management techniques and methods for ensuring scope control.
Familiarity with distributed systems.

Site Reliability Engineering M4 - The typical base pay range for this role across the U.S. is USD $117,200 - $229,200 per year. There is a different range applicable to specific work locations, within the San Francisco Bay area and New York City metropolitan area, and the base pay range for this role in those locations is USD $153,600 - $250,200 per year.

Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here:

Microsoft will accept applications for the role until September 9, 2024.

#azurecorejobs

Responsibilities

Develop, test, and implement changes to optimize code and improve scalability. You leverage end-to-end technical expertise and telemetry analysis to identify patterns and opportunities to implement configuration and automation improvements. You review the effect of changes to documents and share development insights within your team.
You drive Sprint planning, SCRUM stand-ups, and code/design reviews, and host regular cross-team / org meetings.
Investigate hardware and system issues that are impacting available capacity and impacting customers.
Understand the long term goals of the organization and understand the steps your team will have to take to achieve those.
You respond to incidents during regular on-call rotations and share details related to incidents and their resolution through post-mortem reports and regular review meetings. As a member of the team you will be expected to help drive bridges for recovery during major outages.
Embody our and

Benefits/perks listed below may vary depending on the nature of your employment with Microsoft and the country where you work.

Industry leading healthcare

Educational resources

Discounts on products and services

Savings and investments

Maternity and paternity leave

Generous time away

Giving programs

Opportunities to network and connect

Similar Jobs

Senior Customer Support Engineer, Japan

Logitech

Tokyo, Japan (Hybrid)

• 9 Months ago

Senior Cybersecurity Program Manager

Microsoft

Redmond, Washington, United States (Hybrid)

• 8 Months ago

Cloud Engineer

McCain Foods

New Delhi, Delhi, India (Hybrid)

• 12 Months ago

Senior Back-End Engineer (Accessibility Product House)

Sigma Software

Warsaw, Masovian Voivodeship, Poland (On-Site)

• 10 Months ago

Sr Staff Fullstack Engineer, Anonym

Mozilla

(Remote)

• 11 Months ago

Staff Software Engineer, Google Enterprise Network

Google

Bengaluru, Karnataka, India (On-Site)

• 8 Months ago

Technical Project Manager - Physical Network Infrastructure

ByteDance

Singapore (On-Site)

• 10 Months ago

Research Intern - Networking

Microsoft

Redmond, Washington, United States (On-Site)

• 8 Months ago

Site Reliability Engineer, Traffic Platform

ByteDance

San Jose, California, United States (On-Site)

• 10 Months ago

Plant Engineer, Data Center Network Engineering and Cybersecurity

Google

Kansas City, Missouri, United States (On-Site)

• 8 Months ago

Similar Skill Jobs

Job Opportunity : Support Technician (Help Desk)

Rackspace Technology

Gurugram, Haryana, India (Hybrid)

• 8 Months ago

Senior AI Data Scientist

Hitachi

Chennai, Tamil Nadu, India (On-Site)

• 10 Months ago

Machine Learning Engineer

Axinous

Bengaluru, Karnataka, India (On-Site)

• 8 Months ago

Sr Lead Digital Software Engineer - Front End

Buckman

Chennai, Tamil Nadu, India (On-Site)

• 11 Months ago

Data Engineer

Varonis

Herzliya, Tel Aviv District, Israel (Hybrid)

• 8 Months ago

Sr. Full Stack Engineer, Training & Coaching

Highspot

Hyderabad, Telangana, India (Hybrid)

• 11 Months ago

QA-AUTOMATION

Nagarro

Cairo, Cairo Governorate, Egypt (On-Site)

• 10 Months ago

IN-Manager_ Java and Python _Risk Analytics _Advisory_ Gurugram

PwC

Gurugram, Haryana, India (On-Site)

• 8 Months ago

Senior Software Engineer - Data Platform (Mercury)

GoTo Group

Bengaluru, Karnataka, India (On-Site)

• 10 Months ago

Senior Full Stack (.NET+Angular) Engineer (#2623)

N-iX

Poland (Hybrid)

• 8 Months ago

Jobs in Redmond, Washington, United States

Training Facilitator, Call Center

Blinkhealth

St. Louis, Missouri, United States (On-Site)

• 9 Months ago

Salesforce Devops Engineer

Next Level Business Services

Agoura Hills, California, United States (On-Site)

• 10 Months ago

Production Manager - (Print)

IGT

Lakeland, Florida, United States (On-Site)

• 9 Months ago

Sr. Cybersecurity Engineer

Warner Bros Discovery

Georgia, United States (Hybrid)

• 8 Months ago

Senior Software Engineer, Revenue Growth

Discord

San Francisco, California, United States (Remote)

• 10 Months ago

Product Manager

Summer 2025 Intern- Tableau Research

Salesforce

Palo Alto, California, United States (On-Site)

• 11 Months ago

Image Sensor Architect - Pico - San Jose

ByteDance

San Jose, California, United States (On-Site)

• 8 Months ago

Senior Pricing Analyst

On Location

Texas, United States (Remote)

• 8 Months ago

Associate Program Manager | Irvine, CA

Blizzard Entertainment

Irvine, California, United States (Hybrid)

• 9 Months ago

Network Engineering Jobs

Staff Software Engineer, Google Enterprise Network

Google

Bengaluru, Karnataka, India (On-Site)

• 8 Months ago

Senior Network Design Engineer, Google Cloud

Google

Tel Aviv-Yafo, Tel Aviv District, Israel (On-Site)

• 8 Months ago

Software Engineer Graduate (Multi Cloud CDN) - 2025 Start (BS/MS)

ByteDance

Seattle, Washington, United States (On-Site)

• 10 Months ago

Site Reliability Engineer Graduate (Technical Infrastructure) - 2025 Start (BS/MS)

ByteDance

San Jose, California, United States (On-Site)

• 10 Months ago

Software Engineer, Real Time Communication

ByteDance

Singapore (On-Site)

• 10 Months ago

CPU Application Platform Engineer Graduate (Server Platform)- 2025 Start (PhD)

ByteDance

San Jose, California, United States (On-Site)

• 10 Months ago

Network Architecture Intern, Summer 2025

Netflix

Los Gatos, California, United States (On-Site)

• 10 Months ago

Senior Software Engineer, Cloud Infrastructure

ByteDance

San Jose, California, United States (On-Site)

• 8 Months ago

Software Engineer - Datacenter networking

Senior Optical Network Development Engineer

Microsoft

London, England, United Kingdom (On-Site)

• 8 Months ago

About The Company

Microsoft

16 Active Jobs
