Prime Intellect
Senior LLM Inference & Serving Engineer
Prime Intellect, San Francisco, California, United States, 94199
A leading AI technology firm in San Francisco is seeking a Member of Technical Staff - Inference to build and optimize large-scale ML services. This hybrid role involves developing LLM serving infrastructure and optimizing inference systems. Ideal candidates will have significant ML service experience, especially with vLLM or SGLang, and a thorough understanding of GPU architecture. The company offers competitive compensation, flexible work arrangements, and growth opportunities in AI development.