Site Reliability Engineer (SRE), Cloud Incident Response

2 Months ago • 5 Years + • Devops

Job Summary

Job Description

As a Site Reliability Engineer (SRE), you will be part of a global team ensuring the performance, scalability, and reliability of critical cloud-based applications. You will collaborate with global teams, respond to and resolve application incidents, monitor applications using tools like Prometheus and Grafana, create and maintain dashboards, define and track SRE metrics, partner with development teams, analyze incident trends, automate support tasks, and participate in post-incident reviews. The role involves working in a fast-paced environment and contributing to the improvement of the platform's reliability and efficiency.
Must have:
  • Bachelor’s degree in a related field.
  • 5+ years of experience for senior roles; fresh graduates are welcome for junior roles.
  • Proficiency in programming languages, preferably Java, JavaScript or Python.
  • Ability to troubleshoot complex systems.
  • Skilled in debugging, code optimization, and automation.
  • Experience with relational databases and data analysis.
Good to have:
  • Experience working in Site Reliability Engineer (SRE) roles.
  • Hands-on experience with cloud infrastructure, preferably AWS.
  • Familiarity with observability tools such as Grafana, ELK Stack.
  • Experience deploying and managing applications on Kubernetes platforms.
  • Strong skills in analyzing and troubleshooting issues in large-scale, distributed systems.
Perks:
  • Hybrid Work Model and Business Casual Dress Code.
  • Retirement Program, Professional Development Reimbursement
  • Flexible Personal/Vacation Time Off, Sick Leave, Paid Holidays, Business Leave, Maternity Leave, Ordination Leave
  • Medical, Dental, Vision, Life Insurance, Annual Health Check Up, Employee Assistance Program, Parental Leave, Well-Stocked Pantry and Provident Fund Contribution
  • Committed to Welcoming, Celebrating and Thriving on Diversity
  • Hands-On, Team-Customized Training, including SS&C University
  • Paid further education opportunities for employees who are eligible
  • Bonus Scheme, SS&C Stock(s) Allocation for employees who are eligible
  • Welfare Committee: Discounts on fitness clubs, travel and more!

Job Details

As a leading financial services and healthcare technology company based on revenue, SS&C is headquartered in Windsor, Connecticut, and has 27,000+ employees in 35 countries. Some 20,000 financial services and healthcare organizations, from the world's largest companies to small and mid-market firms, rely on SS&C for expertise, scale, and technology.

Job Description

Overall job purpose:

Be part of a global team that ensures the performance, scalability, and reliability of critical cloud-based applications. As part of the Global Investor and Distribution Solutions (GIDS) Platform Services team, you’ll play a key role in keeping our systems running smoothly and efficiently—while helping shape the future of our platform.

What You’ll Do:

  • Collaborate with global teams as part of a follow-the-sun support model.
  • Respond to, troubleshoot, and resolve Level 2 application incidents.
  • Ensure critical applications are effectively monitored using tools like Prometheus and Grafana.
  • Create and maintain dashboards and alerts to enhance visibility into application health.
  • Define, implement, and track key SRE metrics (SLOs, SLIs, error budgets).
  • Partner with development teams to improve application reliability and resilience.
  • Analyze incident trends and recommend improvements to reduce recurrence.
  • Automate repetitive support tasks to improve efficiency.
  • Participate in post-incident reviews and drive reliability initiatives.

Qualifications:

Minimum Qualification

  • Bachelor’s degree in Computer Science, Computer Engineering, IT, or related field.
  • 5+ years of experience for senior roles; fresh graduates welcome for junior roles.
  • Proficiency in one or more programming languages, preferably Java, JavaScript or Python.
  • Proven ability to troubleshoot complex systems.
  • Skilled in debugging, code optimization, and automation.
  • Experience with relational databases and data analysis.

Highly Preferred

  • Experience working in Site Reliability Engineer (SRE) roles or incident response environments.
  • Hands-on experience with cloud infrastructure, preferably AWS.
  • Familiarity with observability tools such as Grafana, ELK Stack, or similar.
  • Experience deploying and managing applications on Kubernetes platforms.
  • Strong skills in analyzing and troubleshooting issues in large-scale, distributed systems.

Why You Will Love It Here!

  • Hybrid Work Model and Business Casual Dress Code, including jeans; central location, a 6-minute walk from Phrom Phong BTS or a 10-minute walk from Sukhumvit MRT
  • Your Future: Retirement Program, Professional Development Reimbursement  
  • Work/Life Balance: Flexible Personal/Vacation Time Off, Sick Leave, Paid Holidays, Business Leave, Maternity Leave, Ordination Leave
  • Your Wellbeing: Medical, Dental, Vision, Life Insurance, Annual Health Check Up, Employee Assistance Program, Parental Leave, Well-Stocked Pantry and Provident Fund Contribution
  • Diversity & Inclusion: Committed to Welcoming, Celebrating and Thriving on Diversity
  • Training: Hands-On, Team-Customized, including SS&C University
  • Paid further education opportunities for employees who are eligible
  • Extra Perks: Bonus Scheme, SS&C Stock(s) Allocation for employees who are eligible
  • Welfare Committee: Discounts on fitness clubs, travel and more!

Unless explicitly requested or approached by SS&C Technologies, Inc. or any of its affiliated companies, the company will not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services.

SS&C Technologies is an Equal Employment Opportunity employer and does not discriminate against any applicant for employment or employee on the basis of race, color, religious creed, gender, age, marital status, sexual orientation, national origin, disability, veteran status or any other classification protected by applicable discrimination laws.
