Senior Site Reliability Engineering Manager

Microsoft

Washington, United States (Hybrid)

7 Months ago • 6-7 Years • Network Engineering • $117,200 PA - $250,200 PA

Job Summary

Job Description

The Senior Site Reliability Engineering Manager at Azure Storage will lead a team optimizing fleet availability and health for one of the world's largest storage services. Responsibilities include designing, developing, and improving automation and uptime; investigating complex issues at scale; and planning solutions to maximize efficiency. This role requires strong leadership in Agile/SCRUM, incident response, and cross-team collaboration. Significant impact on cost reduction and high-level visibility are key aspects. The position involves developing, testing, and implementing code changes for scalability, troubleshooting hardware and system issues, and understanding long-term organizational goals. The role includes on-call rotations and post-mortem reporting.

Must have:

6+ years experience in relevant field
4+ years in Agile/SCRUM leadership
Expertise in distributed systems
Problem-solving and investigation skills
Develop, test, and implement code changes
Incident response and post-mortem reporting

Good to have:

Understanding of server architecture
Familiarity with server components, firmware, BIOS
Understanding management techniques and scope control

Perks:

Industry-leading healthcare
Educational resources
Product and service discounts
Savings and investments
Maternity and paternity leave
Generous time away
Giving programs
Networking opportunities

3 skills required

azure

agile-development

scalability

Job Details

Overview

Are you passionate about hardware and enabling new technology? Do you enjoy complex problem solving and investigation? Azure has one of the largest storage services on the planet, holding exabytes of data and files not just for our third-party customers, but also for many of Microsoft’s own services. This role will focus on managing an ever-growing and changing fleet at scale to maximize efficiency while providing a stable environment for our customers.

As a Senior Site Reliability Engineering Manager on the Azure Storage team, you will lead a team of engineers focused on optimizing fleet availability and health, and on designing, developing, and improving automation and uptime. You will take the lead in planning, investigating complex issues, and designing solutions to problems at scale.

This opportunity will allow you to deepen your knowledge of and experience with massive distributed systems. You will have opportunities to make a significant impact on reducing cost to the business, with exposure and visibility at VP and CVP levels. This position is located in Redmond and has a flexible work environment that supports working from home.

Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

Qualifications

Required Qualifications:

6+ years technical experience in software engineering, network engineering, or systems administration
- OR Bachelor's Degree in Computer Science, Information Technology, or related field AND 3+ years technical experience in software engineering, network engineering, or systems administration
- OR Master's Degree in Computer Science, Information Technology, or related field AND 2+ years technical experience in software engineering, network engineering, or systems administration.
4+ years of Agile / SCRUM planning, and leading large cross team efforts.

Other Requirements:

The ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:
- Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter.

Preferred Qualifications:

7+ years technical experience in software engineering, network engineering, or systems administration
- OR Bachelor's Degree in Computer Science, Information Technology, or related field AND 4+ years technical experience in software engineering, network engineering, or systems administration
- OR Master's Degree in Computer Science, Information Technology, or related field AND 3+ years technical experience in software engineering, network engineering, or systems administration.
Understanding of server architecture and the ability to debug and troubleshoot issues impacting the fleet.
Understanding of server components, firmware, and BIOS, and how they interact.
Understanding of management techniques and methods for ensuring scope control.
Familiarity with distributed systems.

Site Reliability Engineering M4 - The typical base pay range for this role across the U.S. is USD $117,200 - $229,200 per year. There is a different range applicable to specific work locations, within the San Francisco Bay area and New York City metropolitan area, and the base pay range for this role in those locations is USD $153,600 - $250,200 per year.

Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here:

Microsoft will accept applications for the role until September 9, 2024.

#azurecorejobs

Responsibilities

Develop, test, and implement changes to optimize code and improve scalability. You leverage end-to-end technical expertise and telemetry analysis to identify patterns and opportunities to implement configuration and automation improvements. You review the effect of changes to documents and share development insights within your team.
You drive Sprint planning, SCRUM stand ups, code/design reviews, and host regular cross team / org meetings.
Investigate hardware and system issues that are impacting available capacity and impacting customers.
Understand the long term goals of the organization and understand the steps your team will have to take to achieve those.
You respond to incidents during regular on-call rotations and share details related to incidents and their resolution through post-mortem reports and regular review meetings. As a member of the team, you will be expected to help drive bridges for recovery during major outages.
Embody our culture and values.

Benefits/perks listed below may vary depending on the nature of your employment with Microsoft and the country where you work.

Industry leading healthcare

Educational resources

Discounts on products and services

Savings and investments

Maternity and paternity leave

Generous time away

Giving programs

Opportunities to network and connect
