Stride Consulting

Principal Site Reliability Engineer (SRE) Los Angeles, CA or Remote

Stride Consulting, Los Angeles, California, United States, 90079


Overview

At InStride, people are our purpose. We partner with leading employers to unlock opportunities for their employees by providing access to top-tier education programs that align with their employees’ career goals and the company’s business goals. Our mission is to empower our partners’ employees to advance their careers, elevate their expertise, and achieve meaningful personal and professional growth. If you’re passionate about making a difference and driving educational and professional advancement, InStride is the place for you. To get a better feel for our culture, watch more here.

Candidates must be located in one of the following states to be considered eligible for employment: AZ, CA, CO, CT, FL, GA, IL, IN, KS, LA, MD, MA, MI, MO, NV, NH, NJ, NY, PA, OH, OR, TX, VA, WA, WI.

What we're looking for

We’re looking for a Principal Site Reliability Engineer (SRE) to join InStride’s growing engineering team. This is a highly technical role for an individual contributor who thrives at the intersection of cloud architecture, automation, and reliability engineering. You will be the go-to AWS expert for complex initiatives, setting technical direction and raising the bar for operational excellence across our platform. At InStride, every system you design, every automation you implement, and every safeguard you put in place will directly support our mission of expanding access to life-changing education for working adults around the globe.

Skills we’d love to see you show off

Cloud Architecture & Strategy: Design and optimize AWS environments that balance scalability, resilience, and cost efficiency for enterprise workloads.

Technical Leadership & Mentorship: Serve as a trusted technical advisor, guiding engineers on best practices in Kubernetes, DevSecOps, and AWS-native design patterns.

Infrastructure as Code Mastery: Build reusable, version-controlled IaC libraries with AWS CDK, Terraform, or CloudFormation to standardize deployments.

Security & Compliance by Design: Enforce least-privilege IAM, encryption-by-default, and policy-as-code guardrails to meet security and regulatory standards.

Observability & Reliability Engineering: Define SLIs/SLOs, manage error budgets, and implement monitoring strategies with Prometheus, Grafana, and AWS-native tools.

CI/CD Excellence: Optimize automated pipelines with Harness and GitHub, enabling faster, safer, and more reliable software delivery.

Networking & Resilience: Architect secure, performant VPCs, load balancing, and multi-region failover strategies with AWS networking services.

Automation & Self-Service Enablement: Deliver developer-friendly automation and Internal Developer Portal (IDP) capabilities that empower teams to provision infrastructure without SRE intervention.

Who you are

10+ years of experience in SRE, DevOps, or Platform Engineering roles operating production AWS workloads.

Hands-on expertise with AWS EKS, Kubernetes networking, Helm, autoscaling frameworks (Karpenter/Cluster Autoscaler), serverless architectures, and API Gateways.

Proven delivery of service mesh solutions (Istio, Linkerd, or AWS App Mesh) for secure and observable service-to-service communication.

Proficiency with Infrastructure as Code (IaC) using AWS CDK (TypeScript preferred/Python), Terraform, or CloudFormation.

Strong programming and automation skills in Go, Python, or TypeScript, with additional proficiency in Bash.

Demonstrated experience implementing policy-as-code with OPA/Rego or similar tooling integrated into CI/CD pipelines.

Solid understanding of SLI/SLO/error-budget methodologies and hands-on experience with monitoring and alerting stacks (Prometheus, Grafana, CloudWatch, Groundcover).

Deep knowledge of AWS security best practices, including IAM policies, encryption, OS hardening, and compliance enforcement.

Excellent communication skills with the ability to translate reliability metrics into business impact and guide incident/post-mortem discussions.

Experience mentoring engineers and influencing enterprise AWS and DevOps strategies without direct management responsibilities.

Familiarity with Internal Developer Portals (Backstage, Port, Cortex) and self-service automation is a strong plus.

How you will create impact

Elevate platform reliability: Design and operate multi-region, fault-tolerant systems that ensure InStride’s learning platform is always available for learners and partners.

Advance automation at scale: Deliver Infrastructure as Code libraries, CI/CD pipelines, and self-service capabilities that reduce operational toil and accelerate developer productivity.

Champion security and compliance: Implement defense-in-depth strategies, policy-as-code guardrails, and proactive monitoring to protect sensitive data and maintain trust.

Drive observability maturity: Define and enforce SLIs/SLOs, establish error-budget policies, and build monitoring frameworks that inform release readiness and operational decisions.

Enable seamless service connectivity: Deploy and manage service mesh solutions that secure, monitor, and optimize service-to-service communication across Kubernetes workloads.

Influence technical direction: Partner with engineering and security stakeholders to shape InStride’s AWS strategy, ensuring scalability, resilience, and cost efficiency.

Mentor and uplift engineers: Share expertise, lead design reviews, and guide teams toward modern DevOps and SRE practices, raising the technical bar across the organization.

Compensation

At InStride, final offer amounts are dependent on multiple factors including location, depth of experience, interview performance and equity with other team members. We encourage you to talk with your recruiter to learn more about the total compensation and benefits available for this role.

Compensation range: $165,000 - $185,000 USD

We are looking for someone who is not only technically skilled, but also enthusiastic about making a meaningful impact. If this description resonates with you, we're excited about the possibility of having you on our team. As a skills-driven employer, we encourage you to apply if there is a skill-fit, even in the absence of years of experience.

Don’t meet every single requirement?

Studies have shown that women and people of color are less likely to apply to jobs unless they meet every single qualification. At InStride, we are dedicated to building a diverse, inclusive, and authentic workplace, so if you’re excited about this role but your past experience doesn’t align perfectly with every qualification in the job description, we encourage you to apply anyway. You may be just the right candidate for this role!

Benefits @ InStride

InStride employees are eligible to enroll in 2,800+ online certificate and degree programs through our Step Forward program. Unlike traditional tuition reimbursement programs, InStride covers tuition upfront, regardless of course of study, degree type, or school. The program is available to employees starting Day 1.

401(k) plan with company match

Flexible vacation policy

Paid family leave

And more!

Diversity and Inclusion

At InStride, we foster a culture of belonging, support authenticity and intersectionality, and embrace our differences. We are committed to building a diverse pipeline of talent and ensuring equitable access to opportunities. If you have a disability or special need that requires accommodation, please let your recruiter know.

InStride recommends employees have their COVID vaccinations. We may require vaccinations in the future, but not at this time.

For questions on how we use personal information of job applicants, please refer to InStride's Job Applicant Privacy Policy.

About InStride

InStride helps organizations retain talent and upskill employees through education programs, delivering lasting impact. Visit instride.com for more information.
