Lambda
Forward Deployed Engineer (Site Reliability / Infrastructure)
Lambda, Seattle, Washington, US, 98127
Join us as a Forward Deployed Engineer at Lambda, a leader in AI cloud infrastructure serving thousands of customers.
Base Pay Range $240,000.00/yr – $425,000.00/yr
About the Role
We’re looking for a Forward Deployed Engineer to embed directly with a strategic customer, serving as the technical bridge between Lambda and their team. You’ll work where model performance matters most, delivery timelines are urgent, and ambiguity is the default state. Your job is to map problems, structure delivery paths, and ship solutions that create measurable impact.
What You’ll Do
Embed on-site with a named strategic customer, becoming an extension of their team
Act as the primary technical liaison between Lambda and the customer organization
Navigate ambiguous requirements to identify root problems and define clear technical solutions
Scope, sequence, and build full-stack solutions that deliver measurable business value
Design and implement infrastructure optimizations for AI/ML workloads at scale
Debug complex distributed systems issues across the infrastructure stack
Ship iteratively and learn fast, adjusting approach based on customer feedback and results
Identify reusable patterns from customer engagements that can scale across Lambda's customer base
Surface field intelligence that influences Lambda's product roadmap
Document and share learnings to elevate the capabilities of the broader team
Represent Lambda with executive presence in high‑stakes customer interactions
Location & Work Arrangement
This position requires presence in our upcoming Bellevue office location or on‑site with strategic customers 4 days per week. Lambda’s designated work-from-home day is currently Tuesday.
About You
6+ years of experience in SRE, software engineering, or a similar role, with deep knowledge of running Linux clusters and systems
Strong programming skills in Go and Python; experience with GitOps (e.g., ArgoCD), Helm, and Kubernetes operators
Proven experience operating Kubernetes clusters in production environments (on‑prem, EKS, GKE, or similar)
Hands‑on experience with AI/ML workload management tools (Volcano, Kubeflow, or similar)
Familiarity with observability tools such as Prometheus, Grafana, and FluentBit, and with CI/CD pipelines
Proven experience provisioning Kubernetes using tools such as kubeadm, Cluster API, or similar
Excellent communication skills with the ability to translate technical complexity for diverse audiences
Executive presence and ability to represent Lambda in customer‑facing situations
Comfort operating in ambiguous environments with competing priorities
Strong bias for action and shipping iteratively
Nice to Have
Deep Kubernetes expertise: CRDs, CSI, CNI, and experience writing Kubernetes operators
Exposure to HPC clusters, AI/ML workloads, or large‑scale GPU clusters
Hybrid or multi‑cloud Kubernetes environment experience
Contributions to CNCF projects or Kubernetes SIGs
Why Join Us
Work on cutting‑edge managed Kubernetes platforms for AI/ML workloads
Influence the platform roadmap and help shape operations and reliability best practices
Collaborate with a highly skilled engineering team
Opportunity to mentor and grow within a fast‑growing, technology‑driven environment
Salary Range Information
The annual salary range for this position has been set based on market data and other factors. However, a salary higher or lower than this range may be appropriate for a candidate whose qualifications differ meaningfully from those listed in the job description.
Equal Opportunity Employer
Lambda is an Equal Opportunity employer. Applicants are considered without regard to race, color, religion, creed, national origin, age, sex, gender, marital status, sexual orientation, identity, genetic information, veteran status, citizenship, or any other factors prohibited by law.