Senior Site Reliability Engineer

1 Month ago • 2-6 Years • DevOps

Job Summary

Job Description

Microsoft's COSMIC team seeks a Senior Site Reliability Engineer (SRE) to maintain its Azure Kubernetes Service-based platform. Responsibilities include keeping platform components updated, debugging issues that arise from upgrades, identifying patterns in service alerts and building auto-remediation solutions, creating dashboards and alerts for issue identification, and collaborating with cross-functional teams to design and ship new features. The ideal candidate will have 6+ years of technical experience in software engineering, network engineering, or systems administration (or a Bachelor's degree with 3+ years, or a Master's degree with 2+ years), experience with Agile methodologies, and strong problem-solving skills. Experience with Azure cloud and Kubernetes is preferred.
Must have:
  • 6+ years of technical experience in software engineering, network engineering, or systems administration (3+ with a relevant Bachelor's degree, 2+ with a relevant Master's degree)
  • Experience with Agile development
  • Maintain platform component updates and debug issues
  • Build auto-remediation solutions
  • Create dashboards/alerts for system health
Good to have:
  • Azure cloud experience
  • Kubernetes knowledge
Perks:
  • Industry leading healthcare
  • Educational resources
  • Discounts on products and services
  • Savings and investments
  • Maternity and paternity leave
  • Generous time away
  • Giving programs
  • Networking opportunities

Job Details

Overview

M365's COSMIC team designs, builds, and operates a global-scale managed runtime environment based on Azure Kubernetes Service for the benefit of the Microsoft Substrate service and developers. COSMIC can be compared to a ‘Kubernetes PaaS’. Our charter is to build and maintain solutions that let Substrate service teams onboarding to the COSMIC Linux platform focus on their own scenarios and business requirements rather than on common infrastructure concerns such as deployment, upgrades, security, observability, and debuggability.

We are looking for a Senior Site Reliability Engineer to maintain the health of the COSMIC platform by ensuring that all agents are updated and upgrades happen on schedule, and by debugging any issues that arise from them. As an SRE, you will identify patterns in service alerts, add automation to enrich incidents with metadata, and build auto-remediation solutions wherever possible.

Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

Qualifications

Required Qualifications:

  • 6+ years technical experience in software engineering, network engineering, or systems administration
    • OR Bachelor's Degree in Computer Science, Information Technology, or related field AND 3+ years technical experience in software engineering, network engineering, or systems administration
    • OR Master's Degree in Computer Science, Information Technology, or related field AND 2+ years technical experience in software engineering, network engineering, or systems administration.
  • Experience with or exposure to Agile and iterative development processes.

Other Requirements:

The ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:

  • Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.

Preferred Qualifications:

  • Cloud and services experience, including Azure cloud experience.
  • Working knowledge of Kubernetes.

#M365CORE

Responsibilities

  • Keep platform components updated, incorporating dependencies from other applications and tech stacks, and debug any issues arising from such upgrades or updates.
  • Continuously improve our platform by identifying patterns in service alerts and incidents and building solutions for auto-remediation.
  • Build dashboards and alerts to identify issues faster and keep system health in check.
  • Collaborate with cross-functional teams to define, design, and ship new features that keep the platform healthy and stable.

Benefits/perks listed below may vary depending on the nature of your employment with Microsoft and the country where you work.

  • Industry leading healthcare
  • Educational resources
  • Discounts on products and services
  • Savings and investments
  • Maternity and paternity leave
  • Generous time away
  • Giving programs
  • Opportunities to network and connect

About The Company

Microsoft is a tech giant that develops, licenses, and supports a range of software products, services, and devices.
