Lambda

Storage Protocols Engineering Manager

Lambda, San Francisco, California, United States, 94199

Overview

We are seeking an experienced Software Engineering Manager with a history in the development of storage protocols and distributed storage systems to lead a team of Storage Software Engineers and Distributed Systems Engineers in the design, development, and optimization of cutting-edge distributed storage solutions for AI training and inference workloads.

What You’ll Do

Team Leadership & Management:

Grow, hire, lead, and mentor a top-talent team of software engineers focused on distributed storage protocols. Foster a high-velocity culture of innovation, technical excellence, and collaboration. Conduct regular one-on-one meetings, provide feedback, and support career development. Drive outcomes by managing project priorities, deadlines, and deliverables using Agile methodologies.

Technical Strategy & Execution:

Drive the technical vision for distributed storage protocols (e.g., S3, NFS, iSCSI) and underlying systems. Oversee development of high-performance storage solutions for AI workloads (high throughput, low latency, AI access patterns). Lead work on complex distributed systems challenges, including concurrency, fault tolerance, and data durability across data centers. Guide engineering RFCs, requirements, and stakeholder alignment. Identify performance bottlenecks and steer innovative solutions. Lead the team in supporting customers.

Cross-Functional Collaboration:

Work with AI/ML research and product teams to translate storage needs into technical requirements. Collaborate with product engineering to deliver high-quality products; partner with PM to define roadmaps. Coordinate with HPC Architecture, Networking, Compute, and Storage to deploy storage protocols for AI workloads; align with fleet, platforms, and Storage Engineering for reliable products.

Innovation & Research:

Stay current with trends in distributed systems and storage technologies; explore new technologies to improve performance. Collaborate with Lambda product teams on AI inference/training trends.

Qualifications

Experience:

10+ years of software development; 5+ years in a leadership role in storage software engineering; proven track record leading complex, cross-functional projects in fast-paced environments; extensive hands-on experience in distributed storage systems; experience with storage protocols serving volumes at scales > 20PB; expertise in Kubernetes or container orchestration.

Technical Skills:

Knowledge of object, block, and/or file storage protocols; proficiency in C++, Go, Rust, or Python; familiarity with NVMe, RDMA; experience with Docker/Kubernetes integration; strong OS internals knowledge.

Distributed Systems:

Deep understanding of consensus, caching, fault tolerance, data durability, replication, load balancing.

People Management:

Experience building high-performance teams, hiring, upskilling, and performance management.

Nice to Have

Experience delivering distributed storage protocols in CSP/HPC/AI-infrastructure contexts; scales > 100PB; cross-functional engineering management initiatives; domain knowledge in AI/ML frameworks.

Technical Skills:

Deep expertise in storage protocols; strong programming in C++, Go, Rust, or Python; OS internals knowledge.

AI/ML Domain:

Experience with TensorFlow or PyTorch; understanding of AI data access patterns and performance needs.

Distributed Systems & Leadership:

Ability to design complex concurrent systems; experience training or managing managers.

Compensation & Benefits

The base pay range is $330,000 - $495,000 per year. This range is provided by Lambda; actual pay is based on skills and experience. Other compensation details and benefits are described in the posted materials.

Employment Details

Employment type: Full-time
Seniority level: Mid-Senior level
Job function: Engineering and Information Technology
Industry: Software Development

Equal Opportunity Employer

Lambda is committed to diversity and inclusion in all aspects of hiring. We do not discriminate based on race, color, religion, creed, national origin, age, sex, gender identity, sexual orientation, disability, veteran status, or any other legally protected status.
