Site Reliability Engineer (SRE) - Platform Infrastructure team (100% Remote - US
Hopper, WorkFromHome
About the job
Hopper is looking for a Senior Site Reliability Engineer to join our Platform Infrastructure team, the group that builds and operates the cloud foundation powering products used by millions of travelers worldwide.
Our mission is to empower engineers across Hopper to ship fast, stay resilient, and scale effortlessly. If you care about automation, scalability, and developer experience — and want to make a tangible impact on a growing travel tech company — this could be the perfect role for you.
You’ll help evolve a large‑scale, multi‑region infrastructure running in Google Cloud, supporting hundreds of engineers and dozens of product teams. You’ll contribute to building automated, self‑service platform tools, ensuring the foundation is secure, reliable, cost‑efficient, and easy to use.
What you’ll do
- Improve and evolve platform tooling to support a growing number of services and teams across Hopper.
- Design infrastructure workflows that are simple, consistent, and scalable — enabling engineers to build and deploy with confidence.
- Drive automation across key infrastructure components, reducing manual work and increasing reliability.
- Adapt and scale infrastructure offerings to meet the needs of product teams while maintaining a cohesive and maintainable platform.
- Participate in incident response for platform‑level issues as part of a globally distributed, sustainable on‑call rotation.
- Support engineering teams by troubleshooting platform issues, answering infrastructure‑related questions, and reviewing pull requests that affect core systems.
- Collaborate with a small, high‑impact team of SREs focused on operational excellence, performance, and developer experience.
Ideal candidate
- Professional experience in SRE, DevOps, Software Engineering, or Systems Engineering, with a passion for building reliable, scalable infrastructure.
- Strong troubleshooting and incident response skills across distributed systems and cloud‑native environments.
- Solid system design and analytical thinking, with a focus on simplicity, performance, and maintainability.
- Clear and effective communication skills, with the ability to collaborate across engineering teams.
Technical expertise
- Hands‑on experience with major cloud platforms, ideally Google Cloud Platform (GCP).
- Deep familiarity with Infrastructure as Code, preferably using Terraform.
- Experience building on and operating containers and Kubernetes, along with related tools like Helm or Kustomize.
- Working knowledge of service mesh technologies, preferably Istio.
- Solid understanding of networking fundamentals — DNS, TLS, certificates, ingress controllers, etc.
- Knowledge of cloud and infrastructure security best practices, including IAM, RBAC, and network segmentation.
- Familiarity with authentication and authorization protocols.
- Experience with observability stacks: logs, metrics, tracing, and APM (preferably using Datadog).
- Practical knowledge of CI/CD pipelines and deployment automation.
- Exposure to database technologies, both SQL and NoSQL.
- Comfortable writing scripts in Bash, Python, or similar scripting languages.
Benefits and perks
- Competitive salary and pre‑IPO equity package.
- Unlimited PTO.
- Carrot Cash travel stipend.
- Work‑from‑home stipend and co‑working space via FlexDesk.
- Very generous parental leave, above industry standards.
- Entrepreneurial culture, open communication with leadership, small dynamic teams with massive impact.
- 100% employer‑paid medical, dental, and vision coverage.
- Disability & life insurance, Health Reimbursement Account (HRA), 401(k) plan.