Site Reliability Engineer I - India
CME Group
Job Summary
The Site Reliability Engineer (SRE) collaborates with product, business owners, and development teams to establish and manage Service Level Objectives (SLOs), Service Level Indicators (SLIs), and Service Level Agreements (SLAs). This role involves evaluating error budgets, identifying and prioritizing activities to achieve SLOs and SLAs, and improving system reliability, performance, and capacity. SREs focus on engineering and remediating systems, automating routine functions, and supporting technology advances in CME’s electronic trading and clearing platforms, requiring expertise in software development, networking, and system engineering.
Must Have
- Establish and manage Service Level Objectives (SLOs), Service Level Indicators (SLIs), and Service Level Agreements (SLAs)
- Evaluate error budgets and operational results to identify and prioritize activities to ensure SLOs and SLAs are achieved
- Improve the reliability, performance, and capacity of systems
- Engineer or remediate systems to enhance reliability, performance, and capacity
- Automate routine functions to reduce operational toil
- Work in operational areas including Metrics & Monitoring, Emergency Response, Change Management, and Capacity Planning
- Combine software development, networking, and system engineering expertise
- Build and run large-scale, distributed, fault-tolerant software systems and infrastructure
- Support technology advances in CME’s electronic trading and clearing platforms
- Achieve ultra-low latency performance, high capacity, and rock-solid reliability
- Understand underlying technology and continuous integration/delivery of applications
- Solve problems creatively using industry standards
- Develop automated systems for scale and speed
Job Description
The Site Reliability Engineer (SRE) works in an operational capacity, collaborating with product, business owners, and development teams to establish and manage Service Level Objectives (SLOs), Service Level Indicators (SLIs), and Service Level Agreements (SLAs). The SRE also evaluates error budgets and operational results to identify and prioritize the activities needed to ensure SLOs and SLAs are achieved. SREs work to improve the reliability, performance, and capacity of systems, spending much of their time engineering or remediating those systems and automating routine functions to reduce operational toil.
SREs' operational areas include Metrics & Monitoring, Emergency Response, Change Management, and Capacity Planning.
This role combines software development, networking and system engineering expertise to build and run large-scale, distributed, fault-tolerant software systems and infrastructure. This SRE role is critical in supporting technology advances in CME’s electronic trading and clearing platforms. This individual will work on systems that must achieve a unique blend of ultra-low latency performance, the capacity to seamlessly facilitate the busiest trading days in the world economy, and rock-solid reliability and integrity, all while undergoing rapid release cycles. Achieving these goals will require an understanding of both the underlying technology and the continuous integration and delivery of the applications. The candidate must be able to solve problems creatively using industry standards, communicate effectively, and possess the ability to lead others to achieve the critical mission of the team. The individual is heavily involved in the development of automated systems to enable the scale and speed necessary to deliver high-performing application