Sabre Corporation
Site Reliability Engineer (Hospitality Solutions)
Sabre Corporation, Southlake, Texas, United States, 76092
Overview
Site Reliability Engineering (SRE) at Hospitality Solutions combines software and systems engineering to build and run large-scale, distributed, fault-tolerant systems. SREs ensure our services, both internal and external, are reliable, performant, and continuously improving. The role emphasizes automation, operational excellence, and a culture of collaboration and learning. The candidate should have a strong interest in data and the ability to derive insights from operational information to support customers and size infrastructure.
Technical Skills (Required)
Strong hands-on experience with Linux platforms.
Highly skilled in AWS and/or GCP platforms, Terraform, and core cloud concepts.
Proficiency in automation and scripting (Shell scripting, Python).
Understanding of TCP/IP and HTTP protocols and ability to debug network issues.
Administration and subject matter expertise in Splunk (preferred), or other observability tools such as Datadog, Dynatrace, or New Relic.
Ability to apply statistical and data analysis techniques to operational metrics in Splunk for advanced analytics and actionable insights.
Technical Skills (Nice To Have)
Basic knowledge of Oracle and SQL.
Experience with DevOps tools (Jenkins, Ansible, orchestration tools).
PowerShell scripting is a plus.
Experience with AppDynamics is preferred but not required.
Familiarity with Python libraries such as NumPy, SciPy, Pandas, scikit-learn, and statsmodels is a plus.
Experience using AI models or Machine Learning in the context of AIOps is desirable.
Familiarity with ITIL and Change Management (ServiceNow) is a plus.
Operational Responsibilities (Required)
Ensure operational readiness and reliability of systems and services.
Administer and provide expertise in observability tools.
Develop and automate reliability-focused solutions, including SLOs/SLIs.
Lead postmortem and root cause analysis (RCA) investigations.
Participate in on-call rotations and support a 24x7 global environment.
Plan capacity, tune performance, and optimize costs.
Provide technical and product support for hosted solutions, including troubleshooting and debugging.
Manage day-to-day monitoring and automation tasks.
Support planning, implementation, deployment, and measurement of operational assets.
Communication & Leadership
Effectively communicate project status, incidents, problems, and root causes to stakeholders and management.
Lead and manage complex challenges at scale.
You must be intellectually curious, open-minded, and a strong problem solver.
You will provide support and mentorship to team members.
Qualifications
A minimum of 2 years of professional experience in SRE, IT Operations, or DevOps.
A Bachelor’s degree in computer science is preferred but not required.
Benefits
Very competitive compensation
Generous Paid Time Off (25 PTO days)
4 days (one day/quarter) Volunteer Time Off (VTO)
5 days off annually for Year-End Break
Comprehensive medical, dental, and Wellness Program
12 weeks paid parental leave
Flexible working arrangements
Formal and informal reward, recognition and acknowledgement programs
Engaging employee development events
Seniorities, Employment Type, Job Function, Industries
Seniority level: Mid-Senior level
Employment type: Full-time
Job function: Engineering and Information Technology
Industries: Technology, Information and Internet
Locations and salary ranges shown are for example purposes and reflect typical market ranges; actual ranges may vary by location and experience.