Head of Production Engineering & Site Reliability Engineering (SRE)

1 Month ago • 10 Years + • Product Management

Job Summary

As the Head of Production Engineering and Site Reliability Engineering (SRE) for the GIDS organisation, you will lead a team responsible for the scalability, resilience, performance, and reliability of cloud and hybrid infrastructure powering some of the most critical client-facing applications in financial services. You will define the vision and roadmap for Production Engineering and SRE within GIDS, build and lead globally distributed, high-performance teams, and collaborate with Engineering, Product, Compliance, and Infrastructure teams to improve system reliability and efficiency. You will own reliability, uptime, and performance KPIs, implement an incident management lifecycle, and oversee CI/CD pipelines. You will also lead observability implementation, ensure security and compliance, and manage infrastructure and automation. The position requires attracting, retaining, and mentoring top engineering talent, fostering a culture of ownership, and driving career development.
Must have:
  • 10+ years of engineering experience, 5+ in SRE/DevOps leadership
  • Proven track record managing reliable, scalable systems
  • Strong understanding of modern software development lifecycle and cloud technologies
  • Expertise in Kubernetes, AWS, GitOps, observability, and automation
  • Excellent leadership, communication, and stakeholder management skills
Good to have:
  • Familiarity with ISO/SOC2/GDPR compliance frameworks and evidence collection automation

Job Details

As a leading financial services and healthcare technology company based on revenue, SS&C is headquartered in Windsor, Connecticut, and has 27,000+ employees in 35 countries. Some 20,000 financial services and healthcare organizations, from the world's largest companies to small and mid-market firms, rely on SS&C for expertise, scale, and technology.

Job Description

About SS&C Technologies

SS&C is a global provider of investment and financial software-enabled services and software for the global financial services and healthcare industries. The GIDS product suite powers mission-critical investor and distributor services across asset managers, insurance companies, retirement providers, and wealth management platforms.

Job Overview

As the Head of Production Engineering and Site Reliability Engineering (SRE) for the GIDS organisation, you will lead a team responsible for the scalability, resilience, performance, and reliability of cloud and hybrid infrastructure powering some of the most critical client-facing applications in financial services.

You will be the strategic and operational leader for platform reliability, observability, incident response, CI/CD modernisation, and developer productivity.

Why Join SS&C GIDS?

  • Lead mission-critical infrastructure for a globally recognised financial technology provider.
  • Influence the technical direction of a high-impact product suite.
  • Build a modern engineering organisation with a strong culture of innovation, ownership, and reliability.

Key Responsibilities

Leadership & Strategy

  • Define and execute the vision and roadmap for Production Engineering and SRE within GIDS.
  • Build and lead globally distributed, high-performance teams with a focus on talent development, SRE culture, and operational excellence.
  • Collaborate cross-functionally with Engineering, Product, Compliance, and Infrastructure teams to improve system reliability and efficiency.

Production Operations & Incident Management

  • Own reliability, uptime, and performance KPIs for GIDS applications and services.
  • Implement a comprehensive incident management lifecycle (on-call, escalation, RCA, blameless postmortems).
  • Reduce Mean Time to Detect (MTTD) and Mean Time to Resolve (MTTR) through automated observability, alerting, and playbooks.

CI/CD and Platform Engineering

  • Oversee the development and evolution of CI/CD pipelines for all GIDS products using GitHub Actions, ArgoCD, TeamCity, Octopus Deploy, and GitOps principles.
  • Integrate static and dynamic code analysis, vulnerability scanning, artifact promotion, and release gating into the SDLC.
  • Ensure pipeline scalability and governance while maintaining developer velocity.

Observability & Troubleshooting

  • Lead the implementation and usage of modern observability stacks (e.g., OpenTelemetry, Prometheus, Grafana, Splunk, Datadog).
  • Establish SLOs, SLIs, and error budgets with product and engineering teams.
  • Drive root cause identification using distributed tracing, advanced log analysis, and anomaly detection.

Security, Audit & Compliance

  • Partner with security and compliance teams to embed controls into infrastructure and software delivery.
  • Automate audit evidence collection, change tracking, and access management (e.g., HashiCorp Vault, OPA, AWS IAM).
  • Ensure all systems meet internal and regulatory audit requirements (SOC2, GDPR, etc.).

Infrastructure & Automation

  • Champion infrastructure-as-code (IaC) using Terraform, Helm, and Kubernetes for scalable cloud and hybrid deployments.
  • Optimise infrastructure cost, elasticity, and resilience through autoscaling, canary deployments, and chaos testing.
  • Maintain high SLAs for critical services running on Kubernetes, AWS, and on-prem hybrid infrastructure.

Talent Management & Culture

  • Attract, retain, and mentor top engineering talent with a strong focus on diversity and continuous learning.
  • Cultivate a culture of ownership, transparency, blameless accountability, and operational excellence.
  • Drive career development through structured learning paths, performance reviews, and skills-based mentoring.

Talent Management & Global Operations

  • Build and scale a globally distributed 24/7 operations team, ensuring consistent coverage and operational resilience.
  • Establish and enforce engineering and operational standards for deployments, monitoring, and incident response across geographies.
  • Implement and continuously refine a multi-tiered support structure (L1, L2, L3) with clear escalation paths and accountability.
  • Drive hiring, onboarding, and training initiatives that support both site reliability and continuous delivery.
  • Foster a strong engineering culture rooted in transparency, autonomy, learning, and operational excellence.
  • Develop strategies to prevent burnout in around-the-clock operations, including tooling, automation, and shift rotation planning.

Qualifications

Required:

  • 10+ years of experience in engineering, with 5+ years in a leadership role in SRE, DevOps, or Production Engineering.
  • Proven track record managing reliable, scalable systems in a high-compliance environment (e.g., FinTech, HealthTech).
  • Strong understanding of modern software development lifecycle, CI/CD, IaC, and cloud-native technologies.
  • Expertise in Kubernetes, AWS (or Azure/GCP), GitOps workflows, observability tools, and automation frameworks.
  • Excellent leadership, communication, and stakeholder management skills.

Preferred:

  • Certifications: AWS Certified Solutions Architect, CKA/CKAD, or relevant DevOps/SRE certs.
  • Familiarity with ISO/SOC2/GDPR compliance frameworks and evidence collection automation.

We encourage applications from people of all backgrounds and particularly welcome applications from under-represented groups, to enable us to bring a diversity of perspectives to our thinking and conversation. It's important to us that we strive to have a workforce that is diverse in the widest sense.

Unless explicitly requested or approached by SS&C Technologies, Inc. or any of its affiliated companies, the company will not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services.

SS&C Technologies is an Equal Employment Opportunity employer and does not discriminate against any applicant for employment or employee on the basis of race, color, religious creed, gender, age, marital status, sexual orientation, national origin, disability, veteran status or any other classification protected by applicable discrimination laws.
