Associate Manager - SRE

2 Months ago • 6-9 Years • DevOps • Undisclosed

About the job

Job Description

The Associate Manager - SRE is responsible for managing and monitoring MS Power BI and associated systems. Responsibilities include event management (setting up monitoring tools, optimizing dashboards, generating reports), incident management (responding to incidents, performing RCA, implementing solutions), collaboration (communicating with stakeholders, participating in on-call rotations), change & release management (executing service introduction, deploying products), knowledge management (documenting processes), and continual service improvement. The role requires experience with Azure, Power BI, Azure Data Factory, Databricks, UiPath, scripting languages (Python, PowerShell, Bash), and monitoring tools (Azure Monitor, Prometheus, Grafana).
Must have:
  • Experience with Power BI & Azure
  • Incident & Event Management
  • Scripting (Python, PowerShell)
  • Monitoring tools (Azure Monitor)
  • RCA & problem-solving
  • Collaboration & Communication
Good to have:
  • Azure certifications
  • CI/CD experience
  • ITIL familiarity

Overview

Event Management:

  • Set up and manage monitoring tools to track the performance and health of MS Power BI and downstream applications.

  • Monitor, maintain, and optimize Power BI dashboards to ensure they function correctly and efficiently.

  • Generate reports and provide insights on performance, incidents, and improvements.

  • Collaborate with cross-functional teams to implement preventive measures and address emerging concerns promptly.

Incident Management:

  • Respond to and manage incidents related to Power BI and associated downstream systems. Investigate and track incidents to resolution in a timely manner and within predefined SLAs.

  • Perform Root Cause Analysis (RCA) to identify the underlying causes of issues. Implement long-term solutions to prevent recurrence of incidents.

  • Support and maintain Azure Data Factory pipelines, ensuring data ingestion and transformation processes run smoothly.

  • Monitor and troubleshoot Databricks environments, optimizing performance and resolving any issues that arise.

  • Manage and maintain UiPath automation workflows, ensuring they operate reliably and efficiently.

  • Write and document post-incident summaries, root cause analyses, and mitigation protocols to reduce the likelihood of repeat incidents.

Collaboration and Communication:

  • Communicate incidents to relevant stakeholders, relaying information on business impact, risks, prioritization, mitigation, and estimated time to resolution.

  • Participate in on-call rotations to provide timely responses to production incidents and contribute to swift issue resolution.

Change & Release Management:

  • Execute the Service Introduction and Service Acceptance process to validate and test the business application (MS Power BI and digital products) prior to production deployment or redeployment.

  • Deploy products and enhancements with minimal disruption to production systems.

Knowledge Management:

  • Document the processes, procedures, standards, and SLAs of S&T BI & Reporting Services in the Service Knowledge Management System (SKMS).

Continual Service Improvement:

  • Continually seek opportunities for improvement, automate repetitive tasks, and reduce manual intervention.

Responsibilities

Technical Skills:

  • Candidate must have experience with monitoring and logging tools such as Azure Monitor, Prometheus, Grafana, or similar.

  • Strong understanding of cloud platforms, particularly Microsoft Azure.

  • Proficiency in scripting languages such as Python, PowerShell, or Bash.

Soft Skills:

  • Ability to work in a fast-paced, agile environment with large cross-functional teams.

  • Ability to manage multiple priorities at the same time.

  • Strong problem-solving skills and the ability to work under pressure.

  • Excellent interpersonal and communication skills, both written and verbal.

  • Attention to detail and a proactive approach to identifying and resolving issues.

Qualifications

Qualification:

  • Degree in Computer Science, Computer Engineering, or a related field preferred.

Experience:

  • 6-9 years of experience, with a minimum of 3 years in a Site Reliability Engineering (SRE) or IT Application Support role.

  • Candidate must have a strong background in supporting and managing the MS Power BI application and hands-on experience with at least one of the specified technologies (Azure Data Factory, Databricks, UiPath).

  • Candidate must have proven experience as a Site Reliability Engineer, DevOps Engineer, or in a similar role.

  • Experience in developing and implementing automation scripts and tools to improve application reliability and operational efficiency.

  • Candidate must be willing to be an integral part of the Production Support team and to work UK and US shift hours, with weekend shifts on rotation.

  • Candidate must demonstrate a willingness to learn and adapt to new technologies as needed.

Preferred Qualifications:

  • Certifications in Azure, Power BI, or related technologies.

  • Experience with CI/CD pipelines and infrastructure as code (IaC) tools.

  • Familiarity with ITIL practices and principles.


About The Company

Hyderabad, Telangana, India (On-Site)

Kosi, Uttar Pradesh, India (On-Site)

PepsiCo
