Lambda

Engineering Manager, HPC Deployments

Lambda, San Jose, California, United States, 95199

This role is based in San Francisco, San Jose, or Seattle. You will need to be present 4 days per week at one of these offices. Lambda’s designated work‑from‑home day is currently Tuesday.

Base pay range: $267,000/yr – $486,000/yr

About the role

Engineering at Lambda builds and scales our AI Cloud. The HPC Deployments team deploys cutting‑edge NVIDIA GPU clusters on time, at scale, with 100% quality. Reporting to the Director of Fleet Engineering, you will lead and scale a team of HPC engineers to ensure customers receive reliable, high‑performance compute.

What you’ll do

Lead a distributed team of HPC engineers responsible for configuration, validation, and deployment of large‑scale GPU clusters.

Work cross‑functionally with Product and Infrastructure to deliver projects on time, aligning across stakeholders.

Identify efficiency improvements in tools, processes, and automation.

Maintain clear visibility into deployment progress, risks, and outcomes.

Drive outcomes by managing staff allocations, project priorities, deadlines, and deliverables.

Conduct regular one‑on‑one meetings, provide constructive feedback, and support career development for team members.

Stay current on the latest HPC/AI technologies and best practices.

Participate in qualification efforts for new technologies for production deployments.

About you

Extensive experience in HPC or large‑scale infrastructure, plus 3+ years in a leadership or management role.

Strong problem‑solving and troubleshooting skills.

Excellent communication and collaboration with peer engineering managers.

Comfortable leading and mentoring HPC engineers on cluster deployments.

Experience building a high‑performance team through hiring, upskilling, skill redundancy, performance management, and expectation setting.

Flexibility to travel to our North American data centers as on‑site needs arise.

Nice to have

Linux systems administration, automation, scripting/coding.

Containerization technologies (Docker, Kubernetes).

GPU acceleration, virtualization, and cloud computing.

Machine learning and deep learning frameworks (PyTorch, TensorFlow) and benchmarking tools (DeepSpeed, MLPerf).

Customer awareness, diplomacy.

Bachelor’s degree or equivalent experience in a technical field.

Salary range information

The annual salary range for this position is $267,000 – $486,000/yr. A salary higher or lower than this range may be appropriate for a candidate whose qualifications differ meaningfully from those listed.

About Lambda

Founded in 2012; ~400 employees (2025) and growing fast.

Generous cash & equity compensation.

Backed by In‑Q‑Tel, NVIDIA, and other top investors.

Health, dental, and vision coverage.

Wellness and commuter stipends for select roles.

401(k) plan with 2% company match (USA employees).

Flexible paid time off plan.

Equal‑opportunity employer

Lambda is an Equal‑Opportunity employer. Applicants are considered without regard to race, color, religion, creed, national origin, age, sex, gender, marital status, sexual orientation and identity, genetic information, veteran status, citizenship, or any other factors prohibited by law.

Seniority level: Mid‑Senior level

Employment type: Full‑time

Job function: Engineering and Information Technology
