Site Reliability Engineer - Systems

3 Hours ago • 7-10 Years • Devops

Job Summary

Job Description

PhonePe is a leading digital payments company in India with a vast user base and merchant network, expanding into financial services and other tech-enabled businesses. We are looking for a skilled Site Reliability Engineer (SRE) with extensive Linux systems administration experience and database management knowledge. The role involves ensuring the reliability, performance, and scalability of production environments by balancing system and database stability through monitoring, debugging, and automation. Responsibilities include leading incident response, conducting root cause analysis, designing monitoring solutions, driving automation for Linux tasks and databases, collaborating on resilient deployments, architecting scalable infrastructure, enhancing on-call effectiveness, and mentoring junior engineers. The ideal candidate will possess deep Linux expertise, advanced troubleshooting skills, proficiency in configuration management tools, experience with containerization and orchestration, monitoring platforms, scripting languages, networking fundamentals, and database administration principles.
Must have:
  • Advanced Linux systems administration
  • Troubleshooting and debugging complex system issues
  • Configuration management tools (e.g., SaltStack, Ansible)
  • Containerization and orchestration (e.g., Docker, Kubernetes)
  • Monitoring and alerting tools (e.g., Grafana, Prometheus)
  • Automation scripting (e.g., Python, Shell, Go)
  • Networking fundamentals (TCP/IP, DNS, BGP)
  • Database administration fundamentals
Good to have:
  • Cloud infrastructure experience
  • MariaDB, Percona Server, or MySQL administration
  • Backup and recovery technologies (e.g., ZFS)
  • Message queuing systems (e.g., RabbitMQ)
Perks:
  • Medical Insurance
  • Critical Illness Insurance
  • Accidental Insurance
  • Life Insurance
  • Employee Assistance Program
  • Onsite Medical Center
  • Emergency Support System
  • Maternity Benefit
  • Paternity Benefit Program
  • Adoption Assistance Program
  • Day-care Support Program
  • Relocation benefits
  • Transfer Support Policy
  • Travel Policy
  • Employee PF Contribution
  • Flexible PF Contribution
  • Gratuity
  • NPS
  • Leave Encashment
  • Higher Education Assistance
  • Car Lease
  • Salary Advance Policy

Job Details

About PhonePe Group: 

PhonePe is India’s leading digital payments company with 50 crore (500 Million) registered users and 3.7 crore (37 Million) merchants covering over 99% of the postal codes across India. On the back of its leadership in digital payments, PhonePe has expanded into financial services (Insurance, Mutual Funds, Stock Broking, and Lending) as well as adjacent tech-enabled businesses such as Pincode for hyperlocal shopping and Indus App Store, which is India's first localized app store. The PhonePe Group is a portfolio of businesses aligned with the company's vision to offer every Indian an equal opportunity to accelerate their progress by unlocking the flow of money and access to services.

Culture

At PhonePe, we take extra care to make sure you give your best at work, every day! Creating the right environment for you is just one of the things we do. We empower people and trust them to do the right thing. Here, you own your work from start to finish, right from day one. Being enthusiastic about tech is a big part of being at PhonePe. If you like building technology that impacts millions, ideating with some of the best minds in the country, and executing on your dreams with purpose and speed, join us!

Site Reliability Engineer - Systems

Experience: 7 to 10 Years

Summary 

We are seeking a skilled and proactive Site Reliability Engineer (SRE) to join our team. The ideal candidate will have extensive experience in Linux systems administration, understanding of database management, and a proven track record of troubleshooting complex, system-level issues. You will be responsible for ensuring the reliability, performance, and scalability of our production environments, balancing system and database stability through robust monitoring, debugging, and automation practices.

Responsibilities:

  • Lead incident response and resolution: Proactively troubleshoot, debug, and resolve complex system-level incidents and outages, encompassing Linux operating systems, applications, and database technologies.
  • Conduct deep-dive root cause analysis: Perform thorough post-incident analysis to identify underlying issues in production environments, implementing sustainable solutions.
  • Design and implement robust monitoring: Develop, maintain, and enhance comprehensive system and database monitoring, alerting, and observability solutions (e.g., Grafana, Prometheus, PMM).
  • Drive automation and efficiency: Automate Linux system administration tasks, operational runbooks, and database maintenance to improve system reliability, consistency, and operational efficiency.
  • Collaborate on resilient deployments: Partner with development and engineering teams to ensure seamless, reliable, and secure software deployments and infrastructure changes.
  • Architect scalable infrastructure: Contribute to the architectural design and implementation of highly scalable, resilient, and performant infrastructure solutions.
  • Enhance on-call effectiveness: Participate in and continuously improve on-call rotations, developing tools and processes to reduce alert fatigue and minimize human error.
  • Foster technical growth: Mentor and guide junior Site Reliability Engineers (SREs), promoting knowledge sharing and skill development within the team.

Qualifications:

  • Extensive Linux Expertise: Proven experience in advanced Linux systems administration, including a deep understanding of file systems, kernel tuning (sysctl), and performance optimization.
  • Advanced Troubleshooting & Debugging: Exceptional ability to debug and rapidly resolve complex, distributed system-level issues in high-pressure production environments.
  • Configuration Management: Hands-on experience with industry-standard configuration management tools (e.g., SaltStack, Ansible, Puppet).
  • Load Balancing & Proxying: Practical experience with load balancing technologies (e.g., Nginx, HAProxy, LVS) and their configuration for high availability.
  • Containerization & Orchestration: Strong understanding and practical experience with containerization (e.g., Docker) and container orchestration platforms (e.g., Kubernetes, Mesosphere).
  • Monitoring & Alerting Tooling: Proficiency in implementing, maintaining, and leveraging system and database monitoring platforms (e.g., Grafana, Prometheus, PMM) and custom scripting for alerts.
  • Automation & Scripting Mastery: Highly proficient in developing automation solutions using scripting languages (e.g., Python, Shell scripting, Go) for operational tasks.
  • Networking Fundamentals: Solid understanding of core networking concepts and protocols (e.g., TCP/IP, DNS, DHCP, BGP, iptables, IP addressing, and routing protocols).
  • Database Administration Fundamentals: Strong grasp of relational database concepts and practical experience with database administration principles.

Preferred Qualifications:

  • Cloud Infrastructure Experience: Experience managing and troubleshooting private/on-premise cloud environments, with a focus on identifying and mitigating hardware-related issues and their impact.
  • Relational Database Specialization: Deep practical experience with MariaDB, Percona Server, and/or MySQL, encompassing advanced database administration, performance tuning, and complex replication topologies.
  • Backup & Recovery Expertise: Hands-on experience with robust backup and restore technologies, including ZFS.
  • Message Queuing Systems: Familiarity with message queuing systems like RabbitMQ (RMQ).

PhonePe Full Time Employee Benefits (Not applicable for Intern or Contract Roles)

  • Insurance Benefits - Medical Insurance, Critical Illness Insurance, Accidental Insurance, Life Insurance
  • Wellness Program - Employee Assistance Program, Onsite Medical Center, Emergency Support System
  • Parental Support - Maternity Benefit, Paternity Benefit Program, Adoption Assistance Program, Day-care Support Program
  • Mobility Benefits - Relocation benefits, Transfer Support Policy, Travel Policy
  • Retirement Benefits - Employee PF Contribution, Flexible PF Contribution, Gratuity, NPS, Leave Encashment 
  • Other Benefits - Higher Education Assistance, Car Lease, Salary Advance Policy

Working at PhonePe is a rewarding experience! Great people, a work environment that thrives on creativity, and the opportunity to take on roles beyond a defined job description are just some of the reasons you should work with us. Read more about PhonePe on our blog.

About The Company

PhonePe was founded in December 2015 and has emerged as India’s largest payments app, enabling digital inclusion for consumers and merchants alike. With 48 crore (480 Million) registered users, one in four Indians are now on PhonePe. The company has also successfully digitized 3.6 crore (36 Million) offline merchants spread across Tier 2, 3, 4 and beyond, covering 99% of the postal codes across India. PhonePe is also the leader in Bharat Bill Pay System (BBPS), processing over 45% of the transactions on the BBPS platform. PhonePe forayed into financial services in 2017, providing users with safe and convenient investing options on its platform. Since then, the company has introduced several Mutual Funds and Insurance products that offer every Indian an equal opportunity to unlock the flow of money and access to services. PhonePe was recently recognized as the Most Trusted Brand for Digital Payments as per the Brand Trust Report 2023 by Trust Research Advisory (TRA).


