Optomi

Senior Director, Site Reliability and Platform Engineering

Optomi, Austin, Texas, US, 78716

Base salary: $275,000–$295,000 per year.

Location: Tacoma, WA.

Optomi, in partnership with a leader in the technology industry, is seeking an experienced Senior Director of Site Reliability and Platform Engineering for their Tacoma, WA location. The right candidate will have led large teams, ideally of 100 people or more, and will have experience with AWS (and preferably GCP and Azure), DevOps practices, Kubernetes, and CI/CD.

Responsibilities

Lead and mentor a team of reliability and platform engineers, championing a culture of reliability, scalability, and continuous improvement across all customer products, both on-prem and SaaS.

Establish a charter for best-in-class site reliability engineering, and drive engineering teams toward achieving these best practices.

Institute a set of tools and processes that ensure monitoring, observability, capacity planning, disaster recovery, and incident management systems can support 99.999% availability for critical services.

Manage large-scale infrastructure and applications across multiple cloud providers using a mix of native cloud, open-source, and commercial off-the-shelf tools.

Work with stakeholders, including engineering, IT, product management, and customer support, to define and ensure customer-driven SLIs/SLOs exist for both new and existing functionality.

Communicate progress by highlighting accomplishments, risks, mitigations, and other pertinent key performance indicators that feed into the overarching business strategy.

Requirements

15+ years of experience in SRE, platform engineering, or related roles, including at least 5 years in a director-level role.

10+ years of experience with cloud infrastructure (such as AWS, GCP, and Azure) and with DevOps practices.

Proven experience managing large-scale, high-availability systems with an emphasis on containers and Kubernetes environments.

Experience with CI/CD pipelines, monitoring tools, and incident management processes.

Experience with automation and scripting in languages such as Python and Go, and with monitoring and observability tools such as Prometheus and Grafana.

Experience maintaining SOC 2, FedRAMP, or ISO 27001 certifications.

Experience working within a global team structure.

Excellent leadership, communication, and interpersonal skills.

Seniority level: Director

Employment type: Full-time

Job function: Information Technology
