Optomi

Site Reliability Engineer

Optomi, Plano, Texas, US, 75086

Optomi, in partnership with a leading technology operations center, is looking for an SRE – Cloud Platform to join their team in Plano, TX.

6-month contract-to-hire

Onsite in Plano, TX, 4 days per week

Position Summary:

The SRE – Cloud Platform will focus on operating and automating scalable, resilient AWS infrastructure. Working with core AWS services such as EKS, Lambda, Cloud WAN, ECR, and Systems Manager, this role will drive self-healing automation, observability, and CI/CD pipeline integration. The role applies SRE best practices to ensure the reliability, performance, and operational excellence of cloud-native platforms supporting business-critical applications. This position will collaborate closely with Cloud Platform Development, Production Engineering, and Major Incident Management teams to resolve production issues and improve infrastructure.

What the right candidate will enjoy:

Opportunity to work with cutting-edge AWS technologies.

Collaborative and cross-functional team environment.

Focus on automation, scalability, and operational excellence.

What type of experience the right candidate has:

Solid understanding of SRE concepts: SLIs, SLOs, error budgets, incident response.

Strong hands-on experience with AWS services such as EKS, Lambda, Cloud WAN, and Systems Manager.

Experience with infrastructure-as-code tools like Terraform and CloudFormation.

Proficiency in scripting languages such as Python, Bash, or PowerShell.

Familiarity with DevOps tools like GitHub, Harness, and Dynatrace.

What the responsibilities of the right candidate are:

Build and maintain components required to automate and self-heal AWS infrastructure.

Develop and maintain infrastructure as code (IaC) using Terraform for scalable and repeatable deployments.

Manage container orchestration platforms and related cloud-native services.

Define and measure SLIs, SLOs, and error budgets, and drive reliability improvements.

Implement monitoring and observability using Dynatrace and AWS native services like CloudWatch.

Participate in incident management and on-call rotations, and lead blameless postmortems.

Collaborate cross-functionally to embed SRE principles into cloud platform design and operation.

Troubleshoot network issues and manage cloud routing.

Added bonus if you have:

Certifications like AWS Certified DevOps Engineer or AWS Certified Solutions Architect.

Knowledge of integration tools and technologies like MuleSoft, Camel, and message streaming services.

Seniority Level: Mid-Senior level

Employment Type: Full-time

Job Function: Information Technology

Industries: IT Services and IT Consulting
