Site Reliability Engineer
Agile Fuel | World-class Dedicated Engineering Teams, Mountain View
Our client is a leading WebOps platform that powers the open web by hosting high-performance websites in the cloud for global organizations such as Stitch Fix, Okta, Home Depot, Pernod Ricard, and The Barack Obama Foundation. Every day, thousands of developers and marketers use the platform to build, iterate, and scale websites that reach billions of users worldwide.
This SaaS-based solution helps web and digital teams of all sizes improve performance through powerful tools for site management, governance, security, and collaboration. With built-in support for developing, testing, deploying, and running websites — all with best-in-class speed, scalability, and uptime — the platform enables teams to succeed in a fast-paced digital environment.
With over 35% of the web powered by open-source technologies and a $200B+ total addressable market, the company is scaling rapidly and expanding its world-class team. Headquartered in San Francisco, it is the trusted solution for managing high-value WordPress and Drupal websites. Their greatest strength lies in the creativity, passion, and collaboration of their people.
Position Overview
We are looking for a Site Reliability Engineer (SRE) to join the client's engineering team. The SRE will help scale and support a platform that powers hundreds of thousands of websites, runs millions of containers, and serves billions of page views each month.
The role involves maintaining and evolving a Kubernetes-based, cloud-native infrastructure, custom CI/CD pipelines, distributed file systems, and internal tooling designed to manage containers at scale. The company is an active contributor to open-source communities such as WordPress, Drupal, Fedora, Chef, systemd, cURL, Kubernetes, Terraform, and Sensu.
Responsibilities
- Architect and implement global-scale systems using cutting-edge tools on Google Cloud Platform;
- Improve the reliability and scalability of the Pantheon platform using technologies such as Kubernetes, Prometheus, Go, and Terraform;
- Collaborate with engineering teams to help define and achieve Service Level Objectives (SLOs);
- Maintain and enhance infrastructure components, including observability, monitoring, and Kubernetes management;
- Drive continuous improvements in engineering practices and standards for testing, deployment, and development workflows;
- Participate in an on-call rotation to support platform stability and performance.
Skills and Technologies
- Site Reliability Engineering (SRE) practices;
- Kubernetes – orchestration and infrastructure management;
- Prometheus – monitoring and alerting;
- Google Cloud Platform (GCP) – cloud infrastructure;
- Go (Golang) – programming language;
- Python, Ruby, Bash – scripting and automation;
- Terraform – infrastructure as code (IaC);
- CI/CD pipelines – design and maintenance (GitHub Actions, CircleCI);
- Monitoring & Metrics Systems – design and implementation;
- Observability Engineering – instrumentation, logging, tracing;
- Distributed Systems – design and maintenance;
- Multi-tenant Architecture – platform and resource isolation;
- Containers & Container Management at Scale – Docker and orchestration;
- Service Level Objectives (SLOs) – definition and enforcement;
- Automation of Infrastructure Tasks;
- Version Control Systems – Git;
- Incident Management / On-call Operations;
- Security & Governance in SaaS Environments;
- Open-source Contributions (optional) – familiarity with communities like WordPress, Drupal, Chef, systemd, etc.;
- Linux Systems – administration and troubleshooting.
Requirements
- Proven experience working with high-traffic, large-scale platforms in production environments;
- Deep interest in monitoring, metrics, and SRE principles such as SLOs and error budgets;
- Strong preference for automation over manual processes ("toil");
- Proficiency in one or more programming languages such as Go, Python, or Ruby (optional);
- Excellent English communication skills, with the ability to convey complex ideas clearly and collaborate effectively across teams;
- Team-oriented mindset and pride in contributing to shared success.
We offer excellent benefits, including but not limited to:
- People-oriented management without bureaucracy;
- Competitive compensation;
- Flexible schedule;
- 20 working days of annual paid vacation;
- Paid sick leave;
- Friendly and engaging professional team;
- Opportunities for self-realization, career, and professional growth;
- Accounting and legal support.
Seniority level
Mid-Senior level
Employment type
Full-time
Job function
Engineering and Information Technology
Industries
IT Services and IT Consulting