Jobright.ai

Site Reliability Engineer - Inference

Jobright.ai, San Francisco, California, United States, 94199

Posted 2 days ago

Jobright is an AI-powered career platform that helps job seekers discover the top opportunities in the US. We are NOT a staffing agency. Jobright does not hire directly for these positions. We connect you with verified openings from employers you can trust.

Job Summary:
Lambda is the #1 GPU Cloud for ML/AI teams, providing tools for building, testing, and deploying AI products at scale. The Site Reliability Engineer - Inference will work on developing a large-scale platform for running AI models and building a high-throughput, low-latency API for distributed systems.

Responsibilities:
• Work on our Inference service, helping us to develop our large-scale platform for running new, cutting-edge models across tens of thousands of GPUs
• Help build a high-throughput, low-latency API and routing system running at geographically-distributed scale
• Shape a highly reliable distributed system with a focus on reducing operational overhead and deep observability and capacity management.
• Work with the team and our internal ML researchers to adopt and improve new inference engines, models and architectures across a variety of different mediums (such as text, image, video and audio)
• Tackle global networking challenges to deliver the lowest possible latency to our users across all of Lambda’s available capacity
• Help push Lambda forward into the state of the art, and be part of a team that is operating right at the edge of new developments in the industry.
Qualifications:

Required:
• 8 or more years of experience as a software reliability engineer or software engineer working on large-scale, internet-facing production services
• Highly skilled at writing Go and Python
• Experience with bare-metal system installation and administration
• Experience deploying applications and operators on Kubernetes
• Product-focused, balancing operational needs and keeping overheads down with the need to ship features at a rapid pace
• Proven track record of working in an environment with rapid deployment and the ability to stay on top of shifting priorities as the industry rapidly develops
• Willingness to take ownership of projects and help drive them forwards through design, implementation, launch, and maintenance.

Preferred:
• Experience working with machine learning models
• Experience operating large-scale, geographically distributed systems
• Experience developing Kubernetes operators and components

Company:
Lambda provides infrastructure, cloud services, and software for the training and inferencing of AI models. Founded in 2012 and headquartered in San Jose, California, USA, the company has a team size of 201-500 employees and is currently Late Stage. Lambda has a track record of offering H-1B sponsorships.

Seniority level: Mid-Senior level
Employment type: Full-time
Industries: Software Development
Benefits (inferred from the description for this job): Medical insurance, Vision insurance, 401(k)
