Site Reliability Engineer (PA2025Q3JB086)

SS&C Technologies

Hyderabad, Telangana, India (On Site) | Full Time | 1 month ago

Job Summary

As a Site Reliability Engineer (SRE) at SS&C, you will play a key role in ensuring the scalability, reliability, and performance of our systems, infrastructure, and applications. You will collaborate with engineering, system administration, and DevOps teams to design and implement solutions that enhance uptime, system availability, and overall service health. Key responsibilities include maintaining infrastructure, designing monitoring and incident response systems, developing automation tools, and participating in on-call rotations to minimize downtime.

Must Have

Maintain scalable and reliable infrastructure to support mission-critical systems.
Design and implement monitoring, alerting, and incident response systems.
Develop tools and automation to eliminate manual operations and improve system efficiency.
Work with development teams to ensure reliability and performance are considered from the start.
Conduct root cause analysis and postmortems to learn from system failures and prevent recurrence.
Participate in on-call rotations and respond to incidents, minimizing downtime and customer impact.
Continuously improve deployment, configuration, and observability processes.
Strong experience with Linux/Unix systems administration.
Proficient in scripting and programming languages such as Python, Go, or Bash.
Hands-on experience with cloud platforms such as AWS and infrastructure-as-code tools (Terraform, Ansible).
Experience with containerization and orchestration tools (Docker, Kubernetes) and with Linux, Windows, and Citrix environments.
Deep understanding of CI/CD pipelines, networking, and security best practices.
Excellent troubleshooting and problem-solving skills.
Strong communication and collaboration abilities.

Good to Have

Experience with large-scale distributed systems.
Familiarity with SLAs, SLOs, and SLIs.
Previous experience in a DevOps or SRE role in a production environment.

Job Description

Job Title: Site Reliability Engineer (SRE)

We are looking for a highly skilled Site Reliability Engineer (SRE) to join our engineering team. As an SRE, you will be responsible for ensuring the scalability, reliability, and performance of our systems, infrastructure, and applications. You will collaborate closely with software engineers, system administrators, and DevOps professionals to design and implement solutions that improve uptime, system availability, and overall service health.

Key Responsibilities:

Maintain scalable and reliable infrastructure to support mission-critical systems.
Design and implement monitoring, alerting, and incident response systems to ensure high availability and performance.
Develop tools and automation to eliminate manual operations and improve system efficiency.
Work with development teams to ensure reliability and performance are considered from the start.
Conduct root cause analysis and postmortems to learn from system failures and prevent recurrence.
Participate in on-call rotations and respond to incidents, minimizing downtime and customer impact.
Continuously improve deployment, configuration, and observability processes.

Qualifications:

Bachelor’s degree in Computer Science, Engineering, or a related field, or equivalent experience.
Strong experience with Linux/Unix systems administration.
Proficient in scripting and programming languages such as Python, Go, or Bash.
Hands-on experience with cloud platforms such as AWS and infrastructure-as-code tools (Terraform, Ansible).
Experience with containerization and orchestration tools (Docker, Kubernetes) and with Linux, Windows, and Citrix environments.
Deep understanding of CI/CD pipelines, networking, and security best practices.
Excellent troubleshooting and problem-solving skills.
Strong communication and collaboration abilities.

Preferred Qualifications:

Experience with large-scale distributed systems.
Familiarity with SLAs, SLOs, and SLIs.
Previous experience in a DevOps or SRE role in a production environment.

Unless explicitly requested or approached by SS&C Technologies, Inc. or any of its affiliated companies, the company will not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services.

SS&C Technologies is an Equal Employment Opportunity employer and does not discriminate against any applicant for employment or employee on the basis of race, color, religious creed, gender, age, marital status, sexual orientation, national origin, disability, veteran status or any other classification protected by applicable discrimination laws.

15 Skills Required For This Role

Problem Solving, Talent Acquisition, Game Texts, Networking, Linux, Incident Response, AWS, Unix, Ansible, Terraform, CI/CD, Docker, Kubernetes, Python, Bash
