Senior Site Reliability Engineer

1 Month ago • 10-15 Years
Devops

Job Description

We are looking for a Senior Site Reliability Engineer (SRE) who brings deep technical expertise across Azure administration, SCVMM/Hyper-V, Windows and Linux systems, and application support. You will be responsible for driving reliability, scalability, and performance improvements in both our public and private cloud infrastructure. This role is ideal for someone who is passionate about infrastructure as code, automation, and building resilient, observable systems, and who wants to own end-to-end performance monitoring, incident response, and root cause analysis.

Years of Experience

10-15+ years of progressive experience in system administration, cloud operations, or infrastructure engineering

Special Requirements

Weekend break/fix or maintenance work may occasionally be required

Responsibilities

  • Design, operate, and scale hybrid cloud environments using Azure and SCVMM/Hyper-V
  • Handle escalations from teams across Global Technology
  • Participate in Problem Management and innovate permanent solutions to recurring issues
  • Use Infrastructure as Code to create repeatable service offerings
  • Own end-to-end performance monitoring, incident response, and root cause analysis (RCA)
  • Collaborate with development and operations teams to define SLIs/SLOs and improve system reliability
  • Drive automation of manual processes, including environment provisioning, configuration management, and deployments
  • Maintain detailed, high-quality documentation for systems, environments, and workflows

Candidate Requirements

  • Cloud Infrastructure: Expert in Microsoft Azure (IaaS, PaaS, identity, networking, monitoring)
  • Virtualization: Strong experience with System Center Virtual Machine Manager (SCVMM) and Hyper-V
  • Operating Systems: Advanced Windows and Linux (Ubuntu) system administration
  • Containers: Knowledge of containerization and Kubernetes (good to have)
  • Monitoring & Observability: Splunk, AppDynamics, SolarWinds
  • CI/CD Systems: GitLab CI, Octopus Deploy
  • Networking & Security: Solid understanding of DNS, firewalls, certificates, identity/access controls
  • Application Support: Experience supporting production environments, distributed systems, microservices, REST APIs

Core Values

  • Strong interpersonal, oral, and written communication and collaboration skills with all levels of management.
  • Strong organizational skills, including the ability to adapt to shifting priorities and meet frequent deadlines.
  • Demonstrated proactive approach to problem-solving with strong judgment and decision-making capability.
  • Highly resourceful and collaborative team player who can also work effectively on their own, showing initiative and a sense of urgency.
  • Exemplifies our customer-focused, action-oriented, results-driven culture.
  • Forward-looking thinker who actively seeks opportunities, has a desire for continuous learning, and proposes solutions.
  • Ability to act with discretion and maintain complete confidentiality.
  • Dedicated to the firm’s values of non-negotiable integrity, valuing our people, exceeding client expectations, and embracing intellectual curiosity and rigor.

