Site Reliability Engineer III (Hospitality Solutions)

Sabre India

2+ Years | Fort Worth, TX, United States (Hybrid) | Full Time | 2 months ago

Job Summary

Hospitality Solutions, formerly part of Sabre Holdings, is seeking a Site Reliability Engineer III. This role involves combining software and systems engineering to build and run large-scale, distributed, fault-tolerant systems. The SRE will ensure services are reliable, performant, and continuously improving, with a strong emphasis on automation, operational excellence, and a collaborative learning culture. The position requires a passion for data, deep diving into operational insights, and shaping customer support and infrastructure sizing.

Must Have

Strong hands-on experience with Linux platforms.
Highly skilled in AWS and/or GCP platforms, Terraform, and core cloud concepts.
Proficiency in automation and scripting (Shell scripting, Python).
Understanding of TCP/IP and HTTP protocols and ability to debug network issues.
Administration and subject matter expertise in Splunk or other Observability tools.
Ability to apply statistical and data analysis techniques to operational metrics.
Ensure operational readiness and reliability of systems and services.
Develop and automate reliability-focused solutions, including SLOs/SLIs.
Lead postmortem and root cause analysis (RCA) investigations.
Participate in on-call rotations and support a 24x7 global environment.
Provide technical and product support for hosted solutions, including troubleshooting.
Manage day-to-day monitoring and automation tasks.
Minimum of 2 years of professional experience in SRE, IT Operations, or DevOps.

Good to Have

Basic knowledge of Oracle and SQL.
Experience with DevOps tools (Jenkins, Ansible, orchestration tools).
Experience with PowerShell scripting.
Experience with AppDynamics.
Familiarity with Python libraries such as numpy, scipy, pandas, scikit-learn, and statsmodels.
Experience using AI models or Machine Learning in the context of AIOps.
Familiarity with ITIL and Change Management (ServiceNow).

Perks & Benefits

Very competitive compensation
Generous Paid Time Off (25 PTO days)
4 days (one day/quarter) Volunteer Time Off (VTO)
5 days off annually for Year-End Break
Comprehensive medical, dental, and wellness program
12 weeks paid parental leave
Flexible working arrangements
Formal and informal reward, recognition and acknowledgement programs
Fun and engaging employee development events

Job Description

Sabre is a technology company that powers the global travel industry. By leveraging next-generation technology, we create global technology solutions that take on the biggest opportunities and solve the most complex challenges in travel.

Positioned at the center of travel, we shape the future by offering innovative advancements that pave the way for a more connected and seamless ecosystem as we power mobile apps, online travel sites, airline and hotel reservation networks, travel agent terminals, and scores of other solutions.

Simply put, we connect people with moments that matter.

NOTE: TPG Capital, a global alternative asset management firm, recently acquired Hospitality Solutions. Over the coming months, Sabre will work with TPG to formally separate the Hospitality Solutions business from Sabre. It is important to understand that while you will be employed by a Sabre legal entity, your role will be to support the Hospitality Solutions business, which is now owned by TPG.

Hospitality Solutions, formerly part of Sabre Holdings, is a global leader at the forefront of hospitality technology powering over 40,000 properties across 174 countries. Celebrated for our innovative and customer-centric approach, we deliver integrated platforms for distribution, reservations, retailing, and guest experience to both renowned hotel brands and independent properties worldwide.

Job Overview

Site Reliability Engineering (SRE) at Hospitality Solutions combines software and systems engineering to build and run large-scale, distributed, fault-tolerant systems. SREs ensure our services—both internal and external—are reliable, performant, and continuously improving. The role emphasizes automation, operational excellence, and a culture of collaboration and learning.

The candidate needs an absolute love for data and a desire to dive deep into it. You will help shape how we support our customers, size our infrastructure, and draw insights from operational information.

Technical Skills (Required):

Strong hands-on experience with Linux platforms.
Highly skilled in AWS and/or GCP platforms, Terraform, and core cloud concepts.
Proficiency in automation and scripting (Shell scripting, Python).
Understanding of TCP/IP and HTTP protocols and ability to debug network issues.
Administration and subject matter expertise in Splunk (preferred) or another observability tool such as Datadog, Dynatrace, or New Relic.
Must be able to apply statistical and data analysis techniques to operational metrics in Splunk for advanced analytics and actionable insights.

Technical Skills (Nice to Have):

Basic knowledge of Oracle and SQL is helpful.
Experience with DevOps tools (Jenkins, Ansible, orchestration tools).
Experience with PowerShell scripting is a plus.
Experience with AppDynamics is preferred but not required.
Familiarity with Python libraries such as numpy, scipy, pandas, scikit-learn, and statsmodels is a plus.
Experience using AI models or Machine Learning in the context of AIOps is desirable.
Familiarity with ITIL and Change Management (ServiceNow) is a plus.

Operational Responsibilities (Required):

Ensure operational readiness and reliability of systems and services.
Administer and provide expertise in observability tools.
Develop and automate reliability-focused solutions, including SLOs/SLIs.
Lead postmortem and root cause analysis (RCA) investigations.
Participate in on-call rotations and support a 24x7 global environment.
Plan capacity, tune performance, and optimize costs.
Provide technical and product support for hosted solutions, including troubleshooting and debugging.
Manage day-to-day monitoring and automation tasks.
Support planning, implementation, deployment, and measurement of operational assets.

Communication & Leadership:

Effectively communicate project status, incidents, problems, and root causes to stakeholders and management.
Lead and manage complex challenges at scale.
You must have intellectual curiosity, strong problem-solving skills, and openness.
You will provide support and mentorship to team members.

Qualifications:

A minimum of 2 years of professional experience in SRE, IT Operations, or DevOps.
A Bachelor’s degree in computer science is preferred but not required.

Outstanding Benefits

Very competitive compensation
Generous Paid Time Off (25 PTO days)
4 days (one day/quarter) Volunteer Time Off (VTO)
5 days off annually for Year-End Break
Comprehensive medical, dental, and wellness program
12 weeks paid parental leave
An infrastructure that allows flexible working arrangements
Formal and informal reward, recognition and acknowledgement programs
Lots of fun and engaging employee development events

Reasonable Accommodation

Sabre is committed to working with and providing reasonable accommodation to applicants with disabilities. Applicants applying for a Sabre position with a disability who require a reasonable accommodation for any part of the application or hiring process may contact Sabre at recruiting@careers.sabre.com

Determinations on requests for reasonable accommodation will be made on a case-by-case basis.

Affirmative Action

Sabre is an equal employment opportunity/affirmative action employer and is committed to providing employment opportunities to minorities, females, veterans and disabled individuals. EEO IS THE LAW
