Sr. Observability Engineer

7+ years • $147,385 - $185,235 per year
DevOps

Job Description


How You'll LEAD:

As a Senior Observability Engineer within UMG’s IT Technology Services team, you will drive the reliability, performance, and stability of our global technology ecosystem. You’ll own the design and evolution of our observability platform, ensuring visibility across systems, applications, and services.

This role is both hands-on and strategic, and is ideal for an engineer passionate about building scalable, automated, and data-driven monitoring solutions that empower teams to deliver high-performing, resilient systems. You’ll partner with DevOps, Infrastructure, and Application teams to lead observability best practices and shape a culture of proactive system insight across UMG.

How You'll CREATE:

Observability Architecture & Implementation

  • Design, implement, and maintain end-to-end observability solutions across infrastructure, applications, and services.
  • Select, configure, and integrate industry-leading monitoring and telemetry tools (e.g., Prometheus, Grafana, ELK, Dynatrace, Datadog).
  • Develop automation and integrations to streamline metrics, logging, and tracing pipelines.

Monitoring & Incident Response

  • Establish effective alerting frameworks and SLO/SLA-driven dashboards for real-time visibility.
  • Partner with incident response and SRE teams to diagnose, remediate, and prevent production issues.
  • Conduct root cause analysis and proactively identify performance bottlenecks and capacity needs.

Collaboration & Leadership

  • Partner with development, security, and operations teams to embed observability into system design.
  • Lead cross-functional initiatives to standardize monitoring practices and enhance operational maturity.
  • Mentor peers and provide training on observability tools and best practices.

Continuous Improvement

  • Evaluate emerging technologies to evolve UMG’s observability strategy.
  • Drive automation and process improvements to strengthen system performance, resiliency, and insight quality.
  • Integrate observability with security monitoring and compliance workflows.

Data Analysis & Reporting

  • Analyze metrics, logs, and traces to surface insights into system behavior and performance trends.
  • Deliver reports and visualizations tailored for both technical and business stakeholders.

Bring your VIBE:

Required Skills & Experience

  • 7+ years of professional experience in information technology, including 3+ years specializing in observability, monitoring, or site reliability engineering (SRE).
  • Deep knowledge of monitoring toolsets such as Prometheus, Grafana, ELK, Splunk, Dynatrace, Datadog, or equivalent.
  • Proficiency in Python, Go, or Java for automation and tool development.
  • Hands-on experience with Kubernetes, Docker, and cloud platforms (AWS, GCP, or Azure).
  • Strong understanding of networking, infrastructure, and performance optimization.
  • Familiarity with configuration management tools (Ansible, Chef, Puppet) and CI/CD integration.
  • Proven track record designing and delivering dashboards, alerts, and performance reports for multiple audiences.
  • Excellent communication skills, with the ability to translate technical insights into actionable recommendations.

Preferred Certifications (Highly Desirable)

  • Prometheus Certified Admin
  • Kubernetes Administrator or Application Developer
  • Grafana Certified Observer
  • Dynatrace Associate
  • Splunk Core Certified Power User/Admin
  • Elastic Certified Engineer
  • DevOps Engineer Certification (AWS and/or Google)

Perks Playlist:

  • Be part of an entrepreneurial, global organization that values authenticity, drive, creativity, relationships, and a competitive spirit
  • Comprehensive medical, dental, vision, and FSA options, as well as:
      • 100% coverage for out-patient mental health services
      • Wellbeing reimbursements for fitness classes, spa treatments, meal services, travel, and so much more (up to $720/year)
      • A lifetime fertility support allowance of $30,000 to plan participants
  • Student Loan Repayment Assistance and Tuition Reimbursement
  • 100% immediately vested 401(k) match on the first 5% of your contribution on eligible compensation
  • A variety of ways to prioritize much-needed time away from work, including:
      • Flexible Paid Time Off (PTO) for exempt employees
      • 3 weeks of PTO for non-exempt employees
      • 2-week paid Winter Break
      • 10 Company Holidays (including Juneteenth and Wellbeing Day)
      • Summer Fridays (between Memorial Day and Labor Day)
  • Generous paid parental leave for every type of parent

Disclaimer: This job description only provides an overview of job responsibilities that are subject to change.
