
Senior Site Reliability Engineer

2 Months ago • 5-8 Years • DevOps

Job Summary

Job Description

The Senior Site Reliability Engineer at Microsoft Digital will provide technical leadership to a team, building, running, and improving critical public-sector service environments. Responsibilities include coordinating with engineering teams and business partners, owning deployment and reliability targets, proactively identifying and reducing issues, architecting large-scale systems, driving a culture of resilient architectures, and automating repetitive tasks. The ideal candidate is passionate about distributed systems, scalable services, and building a positive team culture. They will possess strong collaboration skills, expertise in system engineering and automation, and a commitment to operational excellence. The role requires experience with large-scale cloud or distributed systems, improving service infrastructure, and implementing AIOps.
Must have:
  • 8+ years experience in software engineering or related field
  • Software development expertise (C#, C++, Ansible, Python, Java)
  • Experience driving improvements and delivering solutions
  • Technical leadership and collaboration skills
  • Experience with large-scale cloud or distributed systems
Good to have:
  • AIOps and automation at scale
  • Experience designing, building, and improving service infrastructure
  • Network as code and automation, AIOps in network space
  • Process automation
Perks:
  • Industry leading healthcare
  • Educational resources
  • Discounts on products and services
  • Savings and investments
  • Maternity and paternity leave
  • Generous time away
  • Giving programs
  • Networking opportunities

Job Details

Overview


The mission of Microsoft Digital (MSD) is to power, protect, and transform the employee experience at Microsoft across the globe. Come build community, explore your passions, pursue your AI and ML aspirations, do your best work, and be part of the team within Microsoft’s Data Platform & Growth (DPG) organization and Experiences & Devices (E+D) division. MSD is the team that innovates, creates, and delivers the vision for Microsoft’s employee experience, human resources, corporate and legal affairs, and global real estate products; runs Microsoft’s internal network and infrastructure; and builds campus modernization and hybrid solutions. You will use generative AI, AI, ML, and other current technologies to empower Microsoft employees with the tools and services that define both the physical and digital future of work.

      

Microsoft’s mission is to empower every person and every organization on the planet to achieve more, and we’re dedicated to this mission across every aspect of our company. Our culture is centered on embracing a growth mindset and encouraging teams and leaders to bring their best each day. Join us and help shape the future of the world.

The Site Reliability Engineering (SRE) team provides leadership, direction, and accountability for application architecture, system design, and end-to-end implementation. As a Senior Site Reliability Engineer, you build, develop, and deliver system improvements using expertise in system engineering, automation, complexity analysis, and scalable system design. You need strong collaboration skills to work closely with other engineering teams so that services and systems are highly stable and performant and meet the expectations of our users. You provide vision and clarity to a team of SREs who build, monitor, and maintain the systems and infrastructure that ensure our customers can quickly access their data and run workloads whenever and wherever they need to. You drive practices and a focus on engineering excellence to identify service problems and areas for improvement, and the team follows up by fixing those problems.

 
At Microsoft, we can offer you an amazing team, exciting challenges, and a fun place to work. The work environment empowers you to have a positive impact on millions of end users.    

The right candidate for this job (is):  

  • Passionate about distributed systems and working with highly scalable services. 
  • Gains fulfilment developing others and building a positive and collaborative team culture. 
  • Enjoys new technological challenges and is motivated to solve them.
  • Excited about making better software and continuously improving the development, integration, and deployment processes.
  • Smart, highly motivated self-starter who thrives in a bottom-up, fast-paced, highly technical environment.
  • Effective collaborator, experienced in creating technical partnerships across teams. 
  • Passionate about excellence and efficiency in day-to-day operations. 

Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

 

Qualifications

Required Qualifications:

  • 8+ years technical experience in software engineering, network monitoring tools, or systems administration
    • OR Bachelor's Degree in Computer Science, Information Technology, or related field AND 5+ years technical experience in software engineering, network engineering, or managing network monitoring tools.
    • OR Master's Degree in Computer Science, Information Technology, or related field AND 3+ years technical experience in software engineering, network engineering, or managing network monitoring tools.
  • Current software development expertise in a couple of programming languages (C#, C++, Ansible, Shell Scripting, Python, Java, etc.)
  • Proven experience with effectively driving improvement and delivering solutions with stakeholders across all levels of an organization

Other Requirements:


The ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:

  • Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.

Preferred Qualifications:

  • Exposure to AIOps and automation at scale.
  • 8+ years technical experience working with large-scale cloud or distributed systems.
  • Experience designing, building, servicing, and driving ongoing improvement of service infrastructure & systems. 
  • Proven track record of improving the reliability, availability, and performance of cloud services.
  • Technical understanding of Network as code and automation as well as AIOps in network space. 
  • 6+ years of Software, Site Reliability, Systems, or Service Engineering experience. 

 

 

#MSD #MSDJOBS #EEJOBS 


 

Responsibilities

  • Uphold a high organizational standard of employee and team satisfaction.
  • Adapt with agility to changing requirements and align investments with priorities.
  • Provide technical leadership to a team of highly passionate and skilled engineers.
  • Build, run and improve critical public-sector service environments.
  • Coordinate planning and execution with internal engineering teams, business partners and technical leaders across the division.
  • Own deployment, availability, reliability, performance and customer escalation targets for these environments. 
  • Proactively identify and reduce issues through the design, testing, and implementation of software.
  • Architect and review designs for large scale integrated systems.  
  • Act as a guiding force in creating and maintaining large-scale architectures, looking after the end-to-end experience of the customer, and otherwise exercising excellent engineering judgment at a level above isolated methods.
  • Drive a culture of creating resilient architectures capable of surviving the failure of any individual component, and of painstakingly reconstructing the causal chain of an outage to determine how the system can be improved.
  • Identify efficient operations practices and drive a culture of automating repetitive tasks with scripts or applications. Bring experience in process automation.
  • Manage network monitoring tools, including their architecture, deployment, and engineering aspects.
  • Knowledge of AIOps implementation is a plus. 
Benefits/perks listed below may vary depending on the nature of your employment with Microsoft and the country where you work.
  • Industry leading healthcare
  • Educational resources
  • Discounts on products and services
  • Savings and investments
  • Maternity and paternity leave
  • Generous time away
  • Giving programs
  • Opportunities to network and connect





About The Company

Microsoft is a tech giant that develops, licenses, and supports a range of software products, services, and devices.
