Member of Technical Staff - GPU Infrastructure
Building the Future of Decentralized AI Development
At Prime Intellect, we're enabling the next generation of AI breakthroughs by helping our customers deploy and optimize massive GPU clusters. As our Solutions Architect for GPU Infrastructure, you'll be the technical expert who transforms customer requirements into production-ready systems capable of training the world's most advanced AI models.
We recently raised $15 million in funding ($20 million raised in total) led by Founders Fund, with participation from Menlo Ventures and prominent angels including Andrej Karpathy (Eureka AI, Tesla, OpenAI), Tri Dao (Chief Scientific Officer of Together AI), Dylan Patel (SemiAnalysis), Clem Delangue (Hugging Face), Emad Mostaque (Stability AI), and many others.
Core Technical Responsibilities
This customer-facing role combines deep technical expertise with hands-on implementation. You'll be instrumental in:
Customer Architecture & Design
- Partner with clients to understand workload requirements and design optimal GPU cluster architectures
- Create technical proposals and capacity planning for clusters ranging from 100 to 10,000+ GPUs
- Develop deployment strategies for LLM training, inference, and HPC workloads
- Present architectural recommendations to technical and executive stakeholders
- Deploy and configure orchestration systems including SLURM and Kubernetes for distributed workloads
- Implement high-performance networking with InfiniBand, RoCE, and NVLink interconnects
- Optimize GPU utilization, memory management, and inter-node communication
- Configure parallel filesystems (Lustre, BeeGFS, GPFS) for optimal I/O performance
- Tune system performance from kernel parameters to CUDA configurations
- Serve as primary technical escalation point for customer infrastructure issues
- Diagnose and resolve complex problems across the full stack: hardware, drivers, networking, and software
- Implement monitoring, alerting, and automated remediation systems
- Provide 24/7 on-call support for critical customer deployments
- Create runbooks and documentation for customer operations teams
Required Experience
- 3+ years hands-on experience with GPU clusters and HPC environments
- Deep expertise with SLURM and Kubernetes in production GPU settings
- Proven experience with InfiniBand configuration and troubleshooting
- Strong understanding of NVIDIA GPU architecture, CUDA ecosystem, and driver stack
- Experience with infrastructure automation tools (Ansible, Terraform)
- Proficiency in Python, Bash, and systems programming
- Track record of customer-facing technical leadership
- NVIDIA driver installation and troubleshooting (CUDA, Fabric Manager, DCGM)
- Container runtime configuration for GPUs (Docker, containerd, Enroot)
- Linux kernel tuning and performance optimization
- Network topology design for AI workloads
- Power and cooling requirements for high-density GPU deployments
- Experience with 1000+ GPU deployments
- NVIDIA DGX, HGX, or SuperPOD certification
- Distributed training frameworks (PyTorch FSDP, DeepSpeed, Megatron-LM)
- ML framework optimization and profiling
- Experience with AMD MI300 or Intel Gaudi accelerators
- Contributions to open-source HPC/AI infrastructure projects
You'll work directly with customers pushing the boundaries of AI, from startups training foundation models to enterprises deploying massive inference infrastructure. You'll collaborate with our world-class engineering team while having direct impact on systems powering the next generation of AI breakthroughs.
We value expertise and customer obsession. If you're passionate about building reliable, high-performance GPU infrastructure and have a track record of successful large-scale deployments, we want to talk to you.
Apply now and join us in our mission to democratize access to planetary-scale computing.
Seniority level
Mid-Senior level
Employment type
Full-time
Job function
Engineering and Information Technology
Industries
Software Development