Site Reliability Engineer (SRE), Cloud Incident Response

1 Month ago • 5+ Years • DevOps

Job Summary

As a Site Reliability Engineer (SRE), you will be part of a global team that ensures the performance, scalability, and reliability of critical cloud-based applications. You will collaborate with global teams, respond to and resolve application incidents, and monitor applications using tools like Prometheus and Grafana. You will also create and maintain dashboards, define and track SRE metrics, partner with development teams, analyze incident trends, automate support tasks, and participate in post-incident reviews. The role involves working in a fast-paced environment and contributing to the reliability and efficiency of the platform.
Must have:
  • Bachelor’s degree in related field.
  • 5+ years of experience for senior roles; fresh graduates are welcome for junior roles.
  • Proficiency in programming languages, preferably Java, JavaScript or Python.
  • Ability to troubleshoot complex systems.
  • Skilled in debugging, code optimization, and automation.
  • Experience with relational databases and data analysis.
Good to have:
  • Experience working in Site Reliability Engineer (SRE) roles.
  • Hands-on experience with cloud infrastructure, preferably AWS.
  • Familiarity with observability tools such as Grafana or the ELK Stack.
  • Experience deploying and managing applications on Kubernetes platforms.
  • Strong skills in analyzing and troubleshooting issues in large-scale, distributed systems.
Perks:
  • Hybrid Work Model and Business Casual Dress Code.
  • Retirement Program, Professional Development Reimbursement
  • Flexible Personal/Vacation Time Off, Sick Leave, Paid Holidays, Business Leave, Maternity Leave, Ordination Leave
  • Medical, Dental, Vision, Life Insurance, Annual Health Check Up, Employee Assistance Program, Parental Leave, Well-Stocked Pantry and Provident Fund Contribution
  • Committed to Welcoming, Celebrating and Thriving on Diversity
  • Training: Hands-On, Team-Customized, including SS&C University
  • Paid further education opportunities for employees who are eligible
  • Bonus Scheme, SS&C Stock(s) Allocation for employees who are eligible
  • Welfare Committee: Discounts on fitness clubs, travel and more!

Job Details

As a leading financial services and healthcare technology company based on revenue, SS&C is headquartered in Windsor, Connecticut, and has 27,000+ employees in 35 countries. Some 20,000 financial services and healthcare organizations, from the world's largest companies to small and mid-market firms, rely on SS&C for expertise, scale, and technology.

Job Description

Overall job purpose:

Be part of a global team that ensures the performance, scalability, and reliability of critical cloud-based applications. As part of the Global Investor and Distribution Solutions (GIDS) Platform Services team, you’ll play a key role in keeping our systems running smoothly and efficiently—while helping shape the future of our platform.

What You’ll Do:

  • Collaborate with global teams as part of a follow-the-sun support model.
  • Respond to, troubleshoot, and resolve Level 2 application incidents.
  • Ensure critical applications are effectively monitored using tools like Prometheus and Grafana.
  • Create and maintain dashboards and alerts to enhance visibility into application health.
  • Define, implement, and track key SRE metrics (SLOs, SLIs, error budgets).
  • Partner with development teams to improve application reliability and resilience.
  • Analyze incident trends and recommend improvements to reduce recurrence.
  • Automate repetitive support tasks to improve efficiency.
  • Participate in post-incident reviews and drive reliability initiatives.

Qualifications:

Minimum Qualification

  • Bachelor’s degree in Computer Science, Computer Engineering, IT, or related field.
  • 5+ years of experience for senior roles; fresh graduates welcome for junior roles.
  • Proficiency in one or more programming languages, preferably Java, JavaScript or Python.
  • Proven ability to troubleshoot complex systems.
  • Skilled in debugging, code optimization, and automation.
  • Experience with relational databases and data analysis.

Highly Preferred

  • Experience working in Site Reliability Engineer (SRE) roles or incident response environments.
  • Hands-on experience with cloud infrastructure, preferably AWS.
  • Familiarity with observability tools such as Grafana, ELK Stack, or similar.
  • Experience deploying and managing applications on Kubernetes platforms.
  • Strong skills in analyzing and troubleshooting issues in large-scale, distributed systems.

Why You Will Love It Here!

  • Hybrid Work Model and Business Casual Dress Code, including jeans; central location, a 6-minute walk from Phromphong BTS or a 10-minute walk from Sukhumvit MRT
  • Your Future: Retirement Program, Professional Development Reimbursement  
  • Work/Life Balance: Flexible Personal/Vacation Time Off, Sick Leave, Paid Holidays, Business Leave, Maternity Leave, Ordination Leave
  • Your Wellbeing: Medical, Dental, Vision, Life Insurance, Annual Health Check Up, Employee Assistance Program, Parental Leave, Well-Stocked Pantry and Provident Fund Contribution
  • Diversity & Inclusion: Committed to Welcoming, Celebrating and Thriving on Diversity
  • Training: Hands-On, Team-Customized, including SS&C University
  • Paid further education opportunities for employees who are eligible
  • Extra Perks: Bonus Scheme, SS&C Stock(s) Allocation for employees who are eligible
  • Welfare Committee: Discounts on fitness clubs, travel and more!

Unless explicitly requested or approached by SS&C Technologies, Inc. or any of its affiliated companies, the company will not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services.

SS&C Technologies is an Equal Employment Opportunity employer and does not discriminate against any applicant for employment or employee on the basis of race, color, religious creed, gender, age, marital status, sexual orientation, national origin, disability, veteran status or any other classification protected by applicable discrimination laws.
