Jobright.ai

Site Reliability Engineer - Inference

Jobright.ai, San Francisco, California, United States, 94199

The following information aims to provide potential candidates with a better understanding of the requirements for this role.

Posted 2 days ago

Jobright is an AI-powered career platform that helps job seekers discover the top opportunities in the US. We are NOT a staffing agency. Jobright does not hire directly for these positions. We connect you with verified openings from employers you can trust.

Job Summary:

Lambda is the #1 GPU Cloud for ML/AI teams, providing tools for building, testing, and deploying AI products at scale. The Site Reliability Engineer - Inference will work on developing a large-scale platform for running AI models and building a high-throughput, low-latency API for distributed systems.

Responsibilities:

• Work on our Inference service, helping us to develop our large-scale platform for running new, cutting-edge models across tens of thousands of GPUs
• Help build a high-throughput, low-latency API and routing system running at geographically-distributed scale
• Shape a highly reliable distributed system with a focus on reducing operational overhead and deep observability and capacity management
• Work with the team and our internal ML researchers to adopt and improve new inference engines, models and architectures across a variety of different mediums (such as text, image, video and audio)
• Tackle global networking challenges to deliver the lowest possible latency to our users across all of Lambda’s available capacity
• Help push Lambda forward into the state of the art, and be part of a team that is operating right at the edge of new developments in the industry

Qualifications:

Required:

• 8 or more years of experience as a software reliability engineer or software engineer working on large-scale, internet-facing production services
• Highly skilled at writing Go and Python
• Experience with bare-metal system installation and administration
• Experience deploying applications and operators on Kubernetes
• Product-focused, balancing operational needs and keeping overheads down with the need to ship features at a rapid pace
• Proven track record of working in an environment with rapid deployment and the ability to stay on top of shifting priorities as the industry rapidly develops
• Willingness to take ownership of projects and help drive them forwards through design, implementation, launch, and maintenance

Preferred:

• Experience working with machine learning models
• Experience operating large-scale, geographically distributed systems
• Experience developing Kubernetes operators and components

Company:

Lambda provides infrastructure, cloud services, and software for the training and inferencing of AI models. Founded in 2012 and headquartered in San Jose, California, USA, Lambda has a team size of 201-500 employees and is currently Late Stage. Lambda has a track record of offering H1B sponsorships.

Seniority level: Mid-Senior level

Employment type: Full-time

Industries: Software Development

Benefits (inferred from the description for this job): Medical insurance, Vision insurance, 401(k)
