Site Reliability Engineer

12 Hours ago • 7-11 Years

Job Summary

Job Description

This Site Reliability Engineer role involves managing and maintaining complex, distributed big data ecosystems to ensure reliability, scalability, and security. The work includes automating processes, optimizing workflows, and driving system improvements across multiple business verticals. Responsibilities include managing Linux/Unix environments, leading on-call rotations, designing automation systems for big data infrastructure, troubleshooting production issues, designing and reviewing system architectures, and enforcing security standards. The engineer will also monitor and optimize system performance, develop automation tools, and collaborate with development teams to integrate best practices.
Must have:
  • Over 6 years of experience managing and maintaining distributed big data ecosystems.
  • Strong expertise in Linux, including IP, iptables, and IPsec.
  • Proficiency in scripting/programming with languages like Perl, Golang, or Python.
  • Hands-on experience with the Hadoop stack (HDFS, HBase, Airflow, YARN, Ranger, Kafka, Pinot).
  • Familiarity with open-source configuration management and deployment tools such as Puppet, Salt, Chef, or Ansible.
  • Solid understanding of networking, open-source technologies, and related tools.
  • Excellent communication and collaboration skills.
  • Experience with DevOps tools: SaltStack, Ansible, Docker, Git.
  • Experience with SRE logging and monitoring tools: ELK stack, Grafana, Prometheus, OpenTSDB, OpenTelemetry.
Good to have:
  • Experience managing infrastructure on public cloud platforms (AWS, Azure, GCP).
  • Experience in designing and reviewing system architectures for scalability and reliability.
  • Experience with observability tools to visualize and alert on system performance.
Perks:
  • Insurance Benefits (Medical, Critical Illness, Accidental, Life)
  • Wellness Program (Employee Assistance Program, Onsite Medical Center, Emergency Support System)
  • Parental Support (Maternity Benefit, Paternity Benefit Program, Adoption Assistance Program, Day-care Support Program)
  • Mobility Benefits (Relocation benefits, Transfer Support Policy, Travel Policy)
  • Retirement Benefits (Employee PF Contribution, Flexible PF Contribution, Gratuity, NPS, Leave Encashment)
  • Other Benefits (Higher Education Assistance, Car Lease, Salary Advance Policy)

Job Details

About PhonePe Group: 

PhonePe is India’s leading digital payments company with 50 crore (500 Million) registered users and 3.7 crore (37 Million) merchants covering over 99% of the postal codes across India. On the back of its leadership in digital payments, PhonePe has expanded into financial services (Insurance, Mutual Funds, Stock Broking, and Lending) as well as adjacent tech-enabled businesses such as Pincode for hyperlocal shopping and Indus App Store which is India's first localized App Store. The PhonePe Group is a portfolio of businesses aligned with the company's vision to offer every Indian an equal opportunity to accelerate their progress by unlocking the flow of money and access to services.

Culture

At PhonePe, we take extra care to make sure you give your best at work, every day! Creating the right environment for you is just one of the things we do. We empower people and trust them to do the right thing. Here, you own your work from start to finish, right from day one. Being enthusiastic about tech is a big part of being at PhonePe. If you like building technology that impacts millions, ideating with some of the best minds in the country, and executing on your dreams with purpose and speed, join us!

About the Role:

This role is responsible for managing and maintaining complex, distributed big data ecosystems. It ensures the reliability, scalability, and security of large-scale production infrastructure. Key responsibilities include automating processes, optimizing workflows, troubleshooting production issues, and driving system improvements across multiple business verticals.

Roles and Responsibilities:

  • Manage, maintain, and support incremental changes to Linux/Unix environments.
  • Lead on-call rotations and incident responses, conducting root cause analysis and driving postmortem processes.
  • Design and implement automation systems for managing big data infrastructure, including provisioning, scaling, upgrades, and patching clusters.
  • Troubleshoot and resolve complex production issues while identifying root causes and implementing mitigating strategies.
  • Design and review scalable and reliable system architectures.
  • Collaborate with teams to optimize overall system performance.
  • Enforce security standards across systems and infrastructure.
  • Set technical direction, drive standardization, and operate independently.
  • Ensure availability, performance, and scalability of systems and services through proactive monitoring, maintenance, and capacity planning.
  • Resolve, analyze, and respond to system outages and disruptions and implement measures to prevent similar incidents from recurring.
  • Develop tools and scripts to automate operational processes, reducing manual workload, increasing efficiency and improving system resilience.
  • Monitor and optimize system performance and resource usage, identify and address bottlenecks, and implement best practices for performance tuning.
  • Collaborate with development teams to integrate best practices for reliability, scalability, and performance into the software development lifecycle.
  • Stay informed of industry technology trends and innovations, and actively contribute to the organization's technology communities.
  • Develop and enforce SRE best practices and principles.
  • Align across functional teams on priorities and deliverables.
  • Drive automation to enhance operational efficiency.

Skills Required:

  • Over 6 years of experience managing and maintaining distributed big data ecosystems.
  • Strong expertise in Linux, including IP, iptables, and IPsec.
  • Proficiency in scripting/programming with languages like Perl, Golang, or Python.
  • Hands-on experience with the Hadoop stack (HDFS, HBase, Airflow, YARN, Ranger, Kafka, Pinot).
  • Familiarity with open-source configuration management and deployment tools such as Puppet, Salt, Chef, or Ansible.
  • Solid understanding of networking, open-source technologies, and related tools.
  • Excellent communication and collaboration skills.
  • DevOps tools: SaltStack, Ansible, Docker, Git.
  • SRE logging and monitoring tools: ELK stack, Grafana, Prometheus, OpenTSDB, OpenTelemetry.

Good to Have:

  • Experience managing infrastructure on public cloud platforms (AWS, Azure, GCP).
  • Experience in designing and reviewing system architectures for scalability and reliability.
  • Experience with observability tools to visualize and alert on system performance.

PhonePe Full Time Employee Benefits (Not applicable for Intern or Contract Roles)

  • Insurance Benefits - Medical Insurance, Critical Illness Insurance, Accidental Insurance, Life Insurance
  • Wellness Program - Employee Assistance Program, Onsite Medical Center, Emergency Support System
  • Parental Support - Maternity Benefit, Paternity Benefit Program, Adoption Assistance Program, Day-care Support Program
  • Mobility Benefits - Relocation benefits, Transfer Support Policy, Travel Policy
  • Retirement Benefits - Employee PF Contribution, Flexible PF Contribution, Gratuity, NPS, Leave Encashment 
  • Other Benefits - Higher Education Assistance, Car Lease, Salary Advance Policy

Working at PhonePe is a rewarding experience! Great people, a work environment that thrives on creativity, and the opportunity to take on roles beyond a defined job description are just some of the reasons you should work with us. Read more about PhonePe on our blog.


About The Company

PhonePe was founded in December 2015 and has emerged as India’s largest payments app, enabling digital inclusion for consumers and merchants alike. With 48 crore (480 Million) registered users, one in four Indians is now on PhonePe. The company has also successfully digitized 3.6 crore (36 Million) offline merchants spread across Tier 2, 3, 4 and beyond, covering 99% of the postal codes across India. PhonePe is also the leader in Bharat Bill Pay System (BBPS), processing over 45% of the transactions on the BBPS platform. PhonePe forayed into financial services in 2017, providing users with safe and convenient investing options on its platform. Since then, the company has introduced several Mutual Funds and Insurance products that offer every Indian an equal opportunity to unlock the flow of money and access to services. PhonePe was recently recognized as the Most Trusted Brand for Digital Payments as per the Brand Trust Report 2023 by Trust Research Advisory (TRA).


