Prime Intellect

Senior LLM Inference & Serving Engineer

Prime Intellect, San Francisco, California, United States, 94199

A leading AI technology firm in San Francisco is seeking a Member of Technical Staff - Inference to build and optimize large-scale ML services. This hybrid role involves developing LLM serving infrastructure and optimizing inference systems. Ideal candidates will have significant ML service experience, especially with vLLM or SGLang, and a thorough understanding of GPU architecture. The company offers competitive compensation, flexible work arrangements, and growth opportunities in AI development.