Site Reliability Engineer

3 Months ago • All levels • DevOps

Job Summary

Job Description

The Site Reliability Engineer (SRE) at Microsoft's Azure Core team will maintain the world's computer, ensuring new servers come online efficiently at hyperscale. Responsibilities involve collaborating with various teams (developers, hardware engineers, datacenter technicians, etc.) to debug and resolve issues, drive continuous improvements, and prevent future problems. This role requires analyzing data to identify problem areas, automating mitigations, and participating in design reviews and problem management. The ideal candidate will have a foundational understanding of distributed systems and experience with programming languages (C, C++, C#, Java). The role involves working with large-scale server and network device management, investigation, and root cause analysis across multiple systems.
Must have:
  • Technical experience in software engineering, network engineering, or systems administration.
  • Distributed systems experience
  • Programming skills (C, C++, C#, Java)
  • Root cause analysis and problem resolution
  • Collaboration with multiple teams
Perks:
  • Industry leading healthcare
  • Educational resources
  • Discounts on products and services
  • Savings and investments
  • Maternity and paternity leave
  • Generous time away
  • Giving programs
  • Networking opportunities

Job Details

Overview

Come build and maintain the world’s computer as a member of the Microsoft Capacity Infrastructure Services team in Azure Core. The team ensures new servers are brought online (capacity buildout) to enable Azure customers to leverage the latest offerings, see the illusion of infinite capacity, and grow the Azure business efficiently at hyperscale.

As a Site Reliability Engineer, you’ll work with a breadth of partners across Microsoft including developers in service teams, hardware engineers, datacenter technicians, supply chain managers, and business leaders to rapidly debug and resolve issues delaying this carefully orchestrated buildout sequence. You’ll drive continuous improvements with these teams to prevent repeats and address common classes of issues across the Azure software stack through design reviews and problem management.

This opportunity will enable you to gain unparalleled system-wide knowledge of how the Azure cloud is built and maintained. The contacts you make with experts will enable you to deep dive on services and new technologies and partner for improvements. You’ll be stretched to automate mitigations tactically and to analyze data strategically to identify problem areas and drive prioritization. This role requires flexibility to hold virtual meetings and collaborate with partners worldwide. It supports working from home up to 100% of the time.

 

Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

Qualifications

Required Qualifications: 

  • Technical experience in software engineering, network engineering, or systems administration.
    • OR Bachelor's Degree in Computer Science, Information Technology, or related field.
  • You must be legally authorised to work in Romania to be eligible for this role (legally authorised means you have citizenship or have been granted a valid visa or work permit).

 

***Relocation expenses are not provided as part of this role

 

Other Requirements:

  • The ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screening: Microsoft Cloud Background Check. This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.

Additional / Preferred Qualifications: 

  • Distributed systems - developing, debugging, monitoring, and deploying.
  • Programming - C, C++, C#, Java.
  • Systems - hardware and software interface, host and networking, large scale server and network device management, investigation and root cause analysis across multiple systems/services/teams.

 


Responsibilities

  • Develops a foundational understanding of distributed systems design, interactions between cloud technology layers and components, basic dependencies at scale, and the code that defines infrastructures. Can contribute to the code base that defines components or features of systems or cloud technologies to improve the reliability and operability of supported products, with direction from other engineers.
  • Supports ongoing engagements with product engineering teams by participating in code/design reviews, regular meetings, on-call rotations, and incident responses throughout product development and operations cycles; draws insights from engagements with product engineering teams and basic analyses of telemetry data to propose potential improvements to code and designs for a defined set of product components or features with guidance from other engineers.
  • Implements simple configuration and data changes across a predefined range of product components or features with guidance from other engineers to develop an understanding of how configurations, binaries, and data can be managed using code, tooling, and automation.
  • Develops an understanding of how to safely and reliably manage changes in production by using existing tools and automation to enable product engineering teams to implement changes across a defined range of components or features, with direction from other engineers.
  • Uses existing tools to troubleshoot problems or flaws affecting the availability, reliability, performance, and/or efficiency of components or features with guidance from other engineers. Suggests potential solutions to resolve and prevent recurring issues and brings them to the attention of other engineers or team leads.
  • Responds to incidents during regular on-call rotations by identifying the level of impact, troubleshooting basic issues, and deploying appropriate fixes to resolve root cause(s); alerts product teams or owners to major customer-impacting issues and escalates the resolution of complex issues and/or those affecting multiple components or features to other engineers as needed. Shares details related to incidents and their resolution through post-mortem reports and during regular review meetings.

Benefits/perks listed below may vary depending on the nature of your employment with Microsoft and the country where you work.
  • Industry leading healthcare
  • Educational resources
  • Discounts on products and services
  • Savings and investments
  • Maternity and paternity leave
  • Generous time away
  • Giving programs
  • Opportunities to network and connect

About The Company

Microsoft is a tech giant that develops, licenses, and supports a range of software products, services, and devices.
