Site Reliability Engineer I

1 Month ago • 2-4 Years • DevOps

Job Summary

Job Description

Zeta is the world's first and only Omni Stack for banks and fintechs, aiming to augment money and banking with technology. This high-growth unicorn, founded in 2015, serves over 10 banks and 25 fintechs across 8 countries, including notable clients like Sodexo and HDFC Bank. The Site Reliability Engineer I will ensure system reliability, automate operational tasks, respond to incidents, plan capacity, optimize performance, and implement Infrastructure as Code.
Must have:
  • Ensuring the reliability of software systems by designing, implementing, and maintaining scalable and reliable infrastructure.
  • Developing automation tools and scripts to streamline operational tasks, reduce manual intervention, and improve overall system efficiency.
  • Monitoring system performance and responding to incidents promptly to minimize downtime and ensure high availability.
  • Analyzing system usage patterns and forecasting future capacity needs to ensure that the infrastructure can handle current and future demands.
  • Identifying and addressing performance bottlenecks in software systems through optimization and tuning.
  • Implementing infrastructure as code practices, using tools like Terraform or Ansible, to define and manage infrastructure in a version-controlled and automated manner.
  • Implementing and maintaining monitoring and logging solutions to gain insights into system behavior, troubleshoot issues, and proactively address potential problems.
  • Participating in an on-call rotation to respond to incidents outside of regular working hours and ensure 24/7 system availability.
  • Collaborating with security teams to implement and maintain security best practices in infrastructure and applications.
  • Developing and maintaining disaster recovery plans to ensure that systems can quickly recover from major outages or failures.
  • Continuously analyzing system performance, reliability, and incidents to identify areas for improvement and implementing changes to enhance overall system resilience.
Good to have:
  • Experience working for a product organization

Job Details

All about Zeta Suite:

Zeta is the world’s first and only Omni Stack for banks and fintechs. We are rethinking payments from core to the edge, led by the vision to augment the purpose of money and banking with technology. Our single, modern software stack comprises processing, loans, customizable mobile and web apps, a fraud engine, and rewards for retail banking. We are a new-age, high-growth startup (and a unicorn!) founded in 2015 by two visionary leaders, Bhavin Turakhia and Ramki Gaddipati, whose entrepreneurial legacy and excellence have put us on top of the global fintech ecosystem. Zeta counts among its customers over 10 banks and 25 fintechs across 8 countries. Notable clients include Sodexo, a leading issuer of employee benefits and rewards with over 30 million global users, and HDFC Bank, the 14th largest bank in the world by market cap. Learn more about our manifesto & beyond.

Responsibilities

  • System Reliability: Ensuring the reliability of software systems by designing, implementing, and maintaining scalable and reliable infrastructure.
  • Automation: Developing automation tools and scripts to streamline operational tasks, reduce manual intervention, and improve overall system efficiency.
  • Incident Response and Resolution: Monitoring system performance and responding to incidents promptly to minimize downtime and ensure high availability.
  • Capacity Planning: Analyzing system usage patterns and forecasting future capacity needs to ensure that the infrastructure can handle current and future demands.
  • Performance Optimization: Identifying and addressing performance bottlenecks in software systems through optimization and tuning.
  • Infrastructure as Code (IaC): Implementing infrastructure as code practices, using tools like Terraform or Ansible, to define and manage infrastructure in a version-controlled and automated manner.
  • Monitoring and Logging: Implementing and maintaining monitoring and logging solutions to gain insights into system behavior, troubleshoot issues, and proactively address potential problems.
  • On-Call Support: Participating in an on-call rotation to respond to incidents outside of regular working hours and ensure 24/7 system availability.
  • Security: Collaborating with security teams to implement and maintain security best practices in infrastructure and applications.
  • Disaster Recovery Planning: Developing and maintaining disaster recovery plans to ensure that systems can quickly recover from major outages or failures.
  • Continuous Improvement: Continuously analyzing system performance, reliability, and incidents to identify areas for improvement and implementing changes to enhance overall system resilience.

Skills

  • Programming Languages: Proficiency in one or more programming languages, commonly Python, Go, or shell scripting (e.g., Bash).
  • Automation and Scripting: Strong automation skills using tools like Ansible, Puppet, Chef, or custom scripts. Knowledge of Infrastructure as Code (IaC) tools like Terraform.
  • Containerization and Orchestration: Experience with containerization technologies like Docker and container orchestration platforms like Kubernetes.
  • Cloud Computing: Proficiency in any of the cloud platforms such as AWS, Azure, or Google Cloud Platform, and knowledge of managing infrastructure in the cloud.
  • Monitoring and Logging: Familiarity with monitoring tools (e.g., Prometheus, Grafana, ELK stack) and logging frameworks to track system performance and troubleshoot issues.
  • Networking: Understanding of networking concepts, protocols, and troubleshooting skills.
  • Security: Knowledge of security best practices, including encryption, access controls, and vulnerability management.
  • Continuous Integration/Continuous Deployment (CI/CD): Understanding and implementation of CI/CD pipelines for automated testing and deployment.
  • Incident Management: Experience in incident response, troubleshooting, and resolution.
  • Version Control: Proficient use of version control systems like Git.

Experience and Qualifications

  • 2-4 years of experience in site reliability engineering.
  • B.Tech/M.Tech in computer science, information technology or a related field.
  • Having experience working for a product organization is a plus.

Equal Opportunity

  • Zeta is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees. We encourage applicants from all backgrounds, cultures, and communities to apply and believe that a diverse workforce is key to our success.
