UST

Site Reliability Engineer (SRE)

UST, Atlanta, Georgia, United States, 30383

Overview

Role: Lead II - DevOps Engineering

Responsibilities

Design, build, and maintain reliable, scalable, and secure cloud-based infrastructure (AWS, Azure, or GCP).
Develop and improve observability using monitoring, logging, and tracing tools (e.g., Prometheus, Grafana, ELK, Datadog).
Automate repetitive tasks and infrastructure using Infrastructure-as-Code (Terraform, CloudFormation, Pulumi).
Create and maintain CI/CD pipelines (e.g., GitHub Actions, GitLab CI, Jenkins, ArgoCD) to support fast and safe delivery.
Lead incident response, root cause analysis, and postmortems to ensure high uptime and rapid recovery.
Optimize system performance, reliability, and cost-effectiveness through proactive monitoring and tuning.
Collaborate with software engineering teams to define SLAs/SLOs and improve service reliability.
Implement and maintain security best practices across environments (e.g., secrets management, IAM, firewalls).
Maintain disaster recovery plans, backups, and high-availability strategies.

What You Need

5+ years of experience as an SRE, DevOps Engineer, or similar role.
Proficiency in scripting and automation (e.g., Bash, Python, Go).
Strong experience with containerization and orchestration (Docker, Kubernetes, Helm).
Solid understanding of Linux systems administration and networking fundamentals.
Experience with cloud platforms (AWS, Azure, or GCP).
Experience with IaC tools like Terraform or CloudFormation.
Familiarity with GitOps and modern deployment practices.
Hands-on experience with observability tools (e.g., Prometheus, Grafana, Datadog).
Strong troubleshooting and incident response skills.

Preferred:
Experience in a high-traffic, microservices-based architecture.
Exposure to service meshes (Istio, Linkerd).
Certifications (e.g., AWS Certified DevOps Engineer, CKA).
Experience with security automation and compliance (e.g., SOC2, ISO27001).

Soft Skills:
Strong communication and collaboration abilities.
Ability to thrive in a fast-paced, agile environment.
Analytical mindset and proactive approach to problem-solving.
A passion for automation, performance, and system design.

Role Location & Compensation

Role Location: Georgia

Compensation Range: $90,000-$135,000

Benefits

Full-time, regular employees accrue a minimum of 10 days of paid vacation per year, 6 days of paid sick leave per year (pro-rated for new hires), and 10 paid holidays, and are eligible for paid bereavement leave and jury duty leave. They are eligible for the Company’s 401(k) with employer matching. Medical, dental, and vision insurance are available, as well as Company-paid basic life insurance, accidental death and disability coverage, and short- and long-term disability benefits. Options to purchase additional voluntary disability benefits and to participate in an HSA/FSA are available as allowed by IRS guidelines. Benefits vary by location. Part-time and full-time temporary employees have different leave and 401(k) provisions. All US employees in locations with more generous paid sick leave laws receive those benefits where applicable.

What We Believe

We proudly embrace the values that have shaped UST since day one: Humility, Humanity, and Integrity. These values foster a people-first, human-centric culture that prioritizes sustainable solutions and keeps people and clients at the forefront of decisions.

Equal Employment Opportunity Statement

UST is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other applicable characteristics protected by law. We will consider qualified applicants with arrest or conviction records in accordance with state and local laws and “fair chance” ordinances. UST reserves the right to redefine roles and responsibilities based on organizational requirements and performance.

Skills: Reliability Engineering, Kubernetes, Cloud Platform, Python Scripting