Site Reliability Engineer

1 Day ago • 1-6 Years • Operations • DevOps • $98,300 PA - $208,800 PA

Job Summary

Job Description

Microsoft's Cloud+AI Silver Team seeks a Site Reliability Engineer to deploy and operate a Secure Work Area in an airgapped environment. This role involves working with engineers enabling Azure services for internal/external customers in highly secured industries, meeting stringent security requirements. Responsibilities include on-call monitoring, automation development, ensuring security and compliance, and collaborating with cross-functional teams. The ideal candidate will possess strong problem-solving skills, experience with large-scale distributed systems, and a commitment to production reliability.
Must have:
  • 4+ years experience in software/network engineering or systems administration
  • 2 years experience with large-scale distributed services and on-call responsibilities
  • Ability to meet Microsoft's security screening requirements
  • Ownership of end-to-end project lifecycle
  • Strong communication & project management skills
Good to have:
  • 2+ years experience with PowerShell, C#, or C++
  • Experience building and influencing towards common goals

Job Details

Overview

Microsoft has an exciting opportunity for a Site Reliability Engineer in the Cloud+AI Silver Team. This team will be responsible for deploying and operating a Secure Work Area, including the infrastructure for collaboration within an airgapped environment. 


In this role, you will have the opportunity to work with engineers who enable a broad set of Azure services to be consumed by internal and external customers in highly secured and regulated industries. The systems and software you build will be required to meet the security policy and assurance requirements of both public and private sector customers.  
  
Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond. 

Qualifications

Required/Minimum Qualifications:

  • 4+ years of technical experience in software engineering, network engineering, or systems administration
    • OR Bachelor's Degree in Computer Science, Information Technology, or related field AND 1+ year(s) technical experience in software engineering, network engineering, or systems administration
    • OR Master's Degree in Computer Science, Information Technology, or related field.
  • 2 years of experience working on large-scale distributed services with on-call responsibilities.

Other Requirements:

  • Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screening: Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.

 

Preferred Qualifications:

  • 2+ years of experience with PowerShell, C#, or C++. 
  • Ability to build and influence broadly towards common goals and priorities. 
  • Ownership of end-to-end project lifecycle with solid project management and communication skills. 

Site Reliability Engineering IC3 - The typical base pay range for this role across the U.S. is USD $98,300 - $193,200 per year. There is a different range applicable to specific work locations, within the San Francisco Bay area and New York City metropolitan area, and the base pay range for this role in those locations is USD $127,200 - $208,800 per year.

Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here:

Microsoft will accept applications for the role until May 9, 2025.

 

#Silver

Responsibilities

The scale of our operations is enormous. Microsoft's products and services are overwhelmingly consumed online, and billions of people use them every day. We need people who enjoy analyzing complicated problems, coming up with creative solutions, and working in focused teams to build things no one has thought of before, all in the service of production reliability.


Acts as a Designated Responsible Individual (DRI) working on call to monitor service for degradation, downtime, or interruptions. Alerts stakeholders as to the status and gains approval to restore system/product/service for simple problems. Responds within Service Level Agreement (SLA) timeframe. Escalates issues to appropriate owners.

Contributes to the development of automation within production and deployment of a complex product feature. Runs code in simulated or other non-production environments to confirm functionality and error-free runtime for products with little to no oversight.

Contributes to efforts to ensure the correct processes are followed to achieve a high degree of security, privacy, safety, and accessibility. Checks for visible evidence to demonstrate compliance for product areas. Develops and holds an understanding of the implications of onboarding new technologies following expectations of compliance at Microsoft.

Remains current in skills by investing time and effort into staying abreast of current developments that will improve the availability, reliability, efficiency, observability, and performance of products while also driving consistency in monitoring and operations at scale.

Applies best practices to reliably build code that is based on well-established methods. Follows best practices for product development and scaling to customer requirements and applies best practices for meeting scaling needs and performance expectations.

Maintains communication with key partners across the Microsoft ecosystem of engineers. Considers partners across teams and their end goals for products to drive and achieve desirable user experiences and to fit the dynamic needs of partners/customers through product development.

Maintains operations of live service as issues arise on a rotational, on-call basis. Implements solutions and mitigations to more complex issues impacting performance or functionality of Live Site service and escalates as necessary. Reviews and writes issue postmortems and shares insights with the team.

About The Company

Microsoft is a tech giant that develops, licenses, and supports a range of software products, services, and devices.
