Site Reliability Engineer - Big Data

1 Month ago • 5-7 Years • Data Analysis

Job Summary

Job Description

PhonePe is seeking a Site Reliability Engineer with 5 to 7 years of experience in Big Data to ensure the stability, scalability, and performance of distributed systems. Responsibilities include managing and automating the Hadoop ecosystem (HDFS, HBase, Hive, Airflow, YARN, Ranger, Kafka, Pinot, Druid) and performing capacity planning, system tuning, and optimization. The role involves handling incidents, conducting root cause analysis, and implementing mitigation strategies. You will also apply system updates and patches, build observability tools, and participate in Kerberos and LDAP administration, collaborating with various teams to ensure data availability and quality.
Must have:
  • Ensure stability, scalability, and performance of Hadoop ecosystem
  • Manage Hadoop infrastructure (HDFS, HBase, Hive, etc.)
  • Automate operations via scripting
  • Perform capacity planning and system tuning
  • Configure and manage Nginx
  • Troubleshoot Linux and Big Data systems
  • Handle on-call responsibilities and incident management
  • Collaborate with infrastructure teams
  • Build tools for observability
  • Participate in Kerberos and LDAP administration
  • Experience with Linux system administration (min 1 year)
  • Hands-on Hadoop administration (over 4 years)
  • Proficient in scripting (Perl, Golang, or Python)
  • Strong operational knowledge of systems
  • Excellent communication skills
Good to have:
  • Design and maintain Airflow DAGs
  • ELK stack administration
  • Familiarity with monitoring tools (Grafana, Prometheus)
  • Exposure to security protocols (Kerberos, LDAP)
  • Familiarity with distributed systems (Elasticsearch)
Perks:
  • Medical Insurance
  • Critical Illness Insurance
  • Accidental Insurance
  • Life Insurance
  • Employee Assistance Program
  • Onsite Medical Center
  • Emergency Support System
  • Maternity Benefit
  • Paternity Benefit Program
  • Adoption Assistance Program
  • Day-care Support Program
  • Relocation benefits
  • Transfer Support Policy
  • Travel Policy
  • Employee PF Contribution
  • Flexible PF Contribution
  • Gratuity
  • NPS
  • Leave Encashment
  • Higher Education Assistance
  • Car Lease
  • Salary Advance Policy

Job Details

About PhonePe Group: 

PhonePe is India’s leading digital payments company with 50 crore (500 million) registered users and 3.7 crore (37 million) merchants covering over 99% of the postal codes across India. On the back of its leadership in digital payments, PhonePe has expanded into financial services (Insurance, Mutual Funds, Stock Broking, and Lending) as well as adjacent tech-enabled businesses such as Pincode for hyperlocal shopping and Indus App Store, which is India’s first localized app store. The PhonePe Group is a portfolio of businesses aligned with the company’s vision to offer every Indian an equal opportunity to accelerate their progress by unlocking the flow of money and access to services.

Culture

At PhonePe, we take extra care to make sure you give your best at work, every day! Creating the right environment for you is just one of the things we do. We empower people and trust them to do the right thing. Here, you own your work from start to finish, right from day one. Being enthusiastic about tech is a big part of being at PhonePe. If you like building technology that impacts millions, ideating with some of the best minds in the country, and executing on your dreams with purpose and speed, join us!

About the Role

As a Site Reliability Engineer (Big Data) with 5 to 7 years of experience at PhonePe, you will be responsible for ensuring the stability, scalability, and performance of distributed systems operating at scale. You will collaborate with development, infrastructure, and data teams to automate operations, reduce manual effort, handle incidents, and continuously improve system reliability. This role requires strong problem-solving skills, operational ownership, and a proactive approach to mentoring and driving engineering excellence.

Roles and Responsibilities

  • Ensure the ongoing stability, scalability, and performance of PhonePe’s Hadoop ecosystem and associated services.
  • Manage and administer Hadoop infrastructure including HDFS, HBase, Hive, Pig, Airflow, YARN, Ranger, Kafka, Pinot, and Druid.
  • Automate BAU operations through scripting and tool development.
  • Perform capacity planning, system tuning, and performance optimization.
  • Set up, configure, and manage Nginx in high-traffic environments.
  • Administer and troubleshoot Linux and Big Data systems, including networking (IP, iptables, IPsec).
  • Handle on-call responsibilities, investigate incidents, perform root cause analysis, and implement mitigation strategies.
  • Collaborate with infrastructure, network, database, and BI teams to ensure data availability and quality.
  • Apply system updates and patches, and manage version upgrades in coordination with security teams.
  • Build tools and services to improve observability, debuggability, and supportability.
  • Participate in Kerberos and LDAP administration.
  • Plan capacity and tune the performance of Hadoop clusters.
  • Work with configuration management and deployment tools like Puppet, Chef, Salt, or Ansible.

Skills Required

  • Minimum 1 year of Linux/Unix system administration experience.
  • Over 4 years of hands-on experience in Hadoop administration.
  • Minimum 1 year of experience managing infrastructure on public cloud platforms like AWS, Azure, or GCP (optional).
  • Strong understanding of networking, open-source tools, and IT operations.
  • Proficient in scripting and programming (Perl, Golang, or Python).
  • Hands-on experience maintaining and managing Hadoop ecosystem components like HDFS, YARN, HBase, and Kafka.
  • Strong operational knowledge of systems (CPU, memory, storage, OS-level troubleshooting).
  • Experience in administering and tuning relational and NoSQL databases.
  • Experience in configuring and managing Nginx in production environments.
  • Excellent communication and collaboration skills.

Good to Have

  • Experience designing and maintaining Airflow DAGs to automate scalable and efficient workflows.
  • Experience in ELK stack administration.
  • Familiarity with monitoring tools like Grafana, Loki, Prometheus, and OpenTSDB.
  • Exposure to security protocols and tools (Kerberos, LDAP).
  • Familiarity with distributed systems like Elasticsearch or similar high-scale environments.

 

PhonePe Full Time Employee Benefits (Not applicable for Intern or Contract Roles)

  • Insurance Benefits - Medical Insurance, Critical Illness Insurance, Accidental Insurance, Life Insurance
  • Wellness Program - Employee Assistance Program, Onsite Medical Center, Emergency Support System
  • Parental Support - Maternity Benefit, Paternity Benefit Program, Adoption Assistance Program, Day-care Support Program
  • Mobility Benefits - Relocation benefits, Transfer Support Policy, Travel Policy
  • Retirement Benefits - Employee PF Contribution, Flexible PF Contribution, Gratuity, NPS, Leave Encashment 
  • Other Benefits - Higher Education Assistance, Car Lease, Salary Advance Policy

Working at PhonePe is a rewarding experience! Great people, a work environment that thrives on creativity, and the opportunity to take on roles beyond a defined job description are just some of the reasons you should work with us. Read more about PhonePe on our blog.


About The Company

PhonePe was founded in December 2015 and has emerged as India’s largest payments app, enabling digital inclusion for consumers and merchants alike. With 48 crore (480 million) registered users, one in four Indians is now on PhonePe. The company has also successfully digitized 3.6 crore (36 million) offline merchants spread across Tier 2, 3, 4 and beyond, covering 99% of the postal codes across India. PhonePe is also the leader in Bharat Bill Pay System (BBPS), processing over 45% of the transactions on the BBPS platform. PhonePe forayed into financial services in 2017, providing users with safe and convenient investing options on its platform. Since then, the company has introduced several Mutual Funds and Insurance products that offer every Indian an equal opportunity to unlock the flow of money and access to services. PhonePe was recently recognized as the Most Trusted Brand for Digital Payments as per the Brand Trust Report 2023 by Trust Research Advisory (TRA).


