Site Reliability Engineer

Progress

Hyderabad, Telangana, India (Hybrid) | Full Time | 3 months ago

Job Summary

The Site Reliability Engineer (SRE) will ensure the reliability, availability, and performance of software systems. This involves designing, building, and maintaining scalable infrastructure, implementing monitoring and alerting, and automating processes. The role requires collaboration with developers, root cause analysis, and participation in on-call rotations to continuously improve system reliability and performance.

Job Description

We are Progress (Nasdaq: PRGS) - the trusted provider of software that enables our customers to develop, deploy, and manage responsible, AI-powered applications and experiences with agility and ease.

We’re proud to have a diverse, global team where we value the individual and enrich our culture by considering varied perspectives, because we believe people power progress. Join us as a Site Reliability Engineer (SRE) and help us do what we do best: propelling business forward.

The Site Reliability Engineer (SRE) will be responsible for ensuring the reliability, availability, and performance of our company's software systems. This role will involve designing, building, and maintaining scalable infrastructure, implementing monitoring and alerting systems, and automating processes to streamline operations.

In this role, you will:

Design, build, and maintain scalable and reliable infrastructure using modern cloud technologies.
Develop and implement monitoring and alerting systems to proactively identify and address issues.
Collaborate with software developers to improve system architecture and performance.
Automate manual processes to increase efficiency and reduce downtime.
Conduct root cause analysis for incidents and implement preventive measures.
Participate in on-call rotation to respond to critical incidents outside of business hours.
Continuously evaluate and improve the reliability and performance of our systems.
Document system configurations, processes, and best practices.

Your background:

Bachelor’s degree in Computer Science, Engineering, or a related field.
Proven experience as a Site Reliability Engineer or similar role.
Strong understanding of cloud computing platforms (e.g., AWS, Google Cloud, Azure).
Proficiency in scripting languages such as Python, Bash, or PowerShell.
Experience with containerization and orchestration technologies (e.g., Docker, Kubernetes).
Solid understanding of networking principles and protocols.
Familiarity with infrastructure as code tools such as Terraform or Ansible.
Excellent problem-solving and troubleshooting skills.
Ability to work independently and collaboratively in a fast-paced environment.
Strong communication skills and ability to work effectively with cross-functional teams.

Preferred Qualifications:

Certifications in cloud computing (e.g., AWS Certified DevOps Engineer, Google Professional Cloud Architect).
Experience with CI/CD pipelines and automation tools (e.g., Jenkins, GitLab CI).
Knowledge of database technologies (e.g., MySQL, PostgreSQL, MongoDB).
Familiarity with Agile methodologies and DevOps practices.
Experience with performance tuning and optimization of distributed systems.

If this sounds like you and fits your experience and career goals, we’d be happy to chat. What we offer in return is the opportunity to experience a great company culture with wonderful colleagues to learn from and collaborate with, and also to enjoy:

Compensation

Competitive remuneration package
Employee Stock Purchase Plan Enrolment

Vacation, Family, and Health

30 days of earned leave
An extra day off for your birthday
Other leave types, such as marriage, casual, maternity, and paternity leave
Premium Group Medical Insurance for employees and five dependents, personal accident insurance coverage, and life insurance coverage
Professional development reimbursement
Interest subsidy on vehicle or personal loans

Apply now!

Together, We Make Progress

Progress is an inclusive workplace where opportunities to succeed are available to everyone. As a multicultural company serving a global community, we encourage a wide range of points of view and celebrate our diverse backgrounds. Our unique combination of perspectives inspires innovation, connects us to our customers and positively affects our communities. It is only by working together and learning from each other that we make Progress. Join us!
