Optomi
Senior Director, Site Reliability and Platform Engineering
Optomi, Tacoma, Washington, US, 98417
Optomi, in partnership with our client, is seeking an experienced Senior Director, Site Reliability and Platform Engineering, to join the client's team on a direct-hire basis, hybrid 2x/week in Tacoma, WA.
We’re seeking a visionary Senior Director of Site Reliability and Platform Engineering to lead and scale our global SRE and DevOps organization. This is a pivotal opportunity to shape reliability strategy, drive operational excellence, and ensure our products perform seamlessly at cloud scale.
In this executive-facing role, you’ll collaborate closely with senior engineering and product leaders to build a culture of reliability, scalability, and continuous improvement—laying the foundation for the next phase of growth.
What You’ll Do
Lead, mentor, and scale a global team of reliability and platform engineers, fostering a culture of operational excellence and innovation.
Define and execute a best-in-class SRE charter, ensuring observability, capacity planning, disaster recovery, and incident management to achieve 99.999% availability.
Manage and optimize large-scale, multi-cloud infrastructure (AWS, GCP, Azure) using a mix of native, open-source, and commercial tools.
Automate and measure reliability through metrics that enhance visibility into product performance, maturity, and risk.
Partner cross-functionally with Engineering, Product Management, IT, and Security to establish and meet SLIs/SLOs that drive customer success.
Implement ongoing reliability training programs across engineering teams to reduce risk and improve response readiness.
Report key metrics and insights to leadership, highlighting progress, risks, and opportunities aligned to business strategy.
What You Bring
15+ years of experience in Site Reliability, Platform Engineering, or related disciplines, with 5+ years in a senior leadership (Director or above) role.
Deep expertise in cloud infrastructure (AWS, GCP, Azure), containers, and Kubernetes at enterprise scale.
Proven success managing large, distributed engineering teams (100+).
Strong hands‑on experience with CI/CD, monitoring, incident management, and observability tools (e.g., Prometheus, Grafana).
Proficiency in automation and scripting (Python, Go, etc.).
Experience maintaining compliance frameworks such as SOC2, FedRAMP, or ISO 27001.
Exceptional leadership, communication, and stakeholder management skills.
Strong business and analytical acumen—able to connect reliability initiatives to strategic and financial outcomes.
Seniority level Director
Employment type Full-time
Job function Information Technology
Industries IT Services and IT Consulting
Medical insurance
Vision insurance