Staff Site Reliability Engineer

Job Description

CME Group is seeking a Staff SRE to build, operate, and scale systems in our Markets portfolio, specifically for the Globex trading platform. This role involves leading product direction for improving reliability, shaping the roadmap and architecture, and driving high-impact changes across teams. The successful candidate will have a strong understanding of SRE principles, enjoy operating Production systems, and be a strong communicator.

About us:

CME Group is seeking a Staff SRE to help build, operate and scale systems in our Markets portfolio. Markets SREs work on products and applications related to CME’s Globex trading platform. Our systems combine low-latency performance with rock-solid reliability to handle the world’s busiest trading days.

The successful candidate will have a strong understanding of SRE principles and practices, enjoy the cut-and-thrust of operating Production systems, be a strong communicator, and may have previously worked in an SRE role, a software engineering role, a DevOps role or a systems engineering role.

About the role:

As a Staff SRE, you'll lead Product direction for improving reliability. You will shape our roadmap and architecture, and drive high-impact changes across teams.

Key responsibilities:

  • Serve as the technical leader for Product reliability - defining a Product Reliability Roadmap and influencing decisions on direction and prioritisation
  • Define, design and lead the implementation of Service-Level Indicators (SLIs) and Service-Level Objectives (SLOs) that truly reflect customer experience, alongside appropriate observability and monitoring
  • Work alongside lead product engineers to design testing for reliability, performance, capacity and DR
  • Lead reliability delivery for the team, assuming accountability while managing risks and dependencies, and ensuring Product leadership are proactively updated
  • Participate in on-call and act as an escalation point for others; step in as Operational Lead in major incident response - demonstrating urgency while remaining calm and considered
  • Lead post-incident analyses and work with stakeholders to prioritise both tactical and strategic improvements
  • Apply a continuous improvement mindset, identify reliability process improvements and work with Product leaders to influence change and adoption of team process
  • Improve reliability, quality, and time-to-market by removing toil and seizing opportunities to shift left
  • Participate in DR testing and continuously improve the DR process
  • Lead Production review meetings based on SLOs, error budgets and incident data and ensure outcomes are decided and prioritised
  • Represent SRE in architecture decisions, with reliability and resiliency as a priority
  • Mentor other engineers in SRE principles, championing a culture of “SRE as a practice”
  • Map usage to capacity and cost, while ensuring no impact on reliability
  • Support the migration of markets applications to Google Cloud Platform (GCP), ensuring a seamless transition
  • Stay informed on emerging technologies & latest industry trends, and recognise opportunities for CME Group
  • Develop POCs which can be adopted and reused across the organisation

What We’re Looking for:

  • A highly accountable and collaborative person with a demonstrated ability to influence change and a proven track record of leading large-scale changes
  • Excellent communication and teamwork skills; someone with strong stakeholder management who is able to communicate effectively across disciplines and across regions
  • Experience working in an SRE role
  • Experience with Linux-based systems & with Cloud-based platform(s) - Google Cloud Platform, GCE, and/or GKE a bonus
  • Strong knowledge of application architectures, messaging middleware, and network protocols
  • Strong coding capabilities (scripting and automation tools such as Python, Bash, Ansible, Terraform, etc.); Java a bonus
  • Experience with monitoring and observability tools such as OpenTelemetry, Splunk, Prometheus, Grafana, etc.
  • Experience automating CI/CD processes and solutions
  • A growth mindset; eagerness to learn and adapt in a fast-paced trading environment
  • Understanding of current and emerging technologies

Desirable:

  • Experience working on financial applications and trading platforms in capital markets
  • Experience working on an ultra-low latency (ULL) platform
  • Experience working in an agile environment

Why CME Group:

  • Be part of a global leader in financial services technology
  • Work on cutting-edge technology in a collaborative and innovative culture
  • Competitive compensation and benefits package
  • Opportunity to grow and advance your career in SRE with an organisation that is transitioning to this approach

Join CME Group and play a crucial role in ensuring the stability and performance of our Markets applications while contributing to the migration to Google Cloud Platform. Apply now to be a part of our dynamic SRE team!
