Qualcomm
LLM Serving Engineer (Cloud AI Engineering), Staff Engineer
Qualcomm, San Diego, California, United States, 92189
Company:
Qualcomm Technologies, Inc.
Job Area:
Engineering Group, Engineering Group > Machine Learning Engineering
Overview
LLM Serving Engineer (Cloud AI Engineering)
Qualcomm is utilizing its traditional strengths in digital wireless technologies to play a central role in the evolution of Cloud AI. We are investing in several supporting technologies, including Deep Learning. The Qualcomm Cloud AI team is developing hardware and software solutions for Inference Acceleration. We are hiring LLM Serving Engineers at multiple levels to join our dynamic, collaborative team. This role spans the full product lifecycle, from cutting-edge research and development to commercial deployment, and demands strategic thinking, strong execution, and excellent communication skills.
Responsibilities
• Building a scalable LLM inference platform using inference techniques (e.g., disaggregated serving and KV-cache management, advanced parallelism, speculative algorithms, model optimization, specialized kernels).
• Contributing to the development of LLM serving packages (e.g., vLLM, SGLang, TGI, Triton Inference Server, Dynamo, llm-d).
• Working closely with customers to drive solutions by collaborating with internal compiler, firmware, and platform teams.
• Working at the forefront of GenAI by understanding advanced algorithms (e.g., attention mechanisms, MoEs) and numerics to identify new optimization opportunities.
• Driving efficient serving through autoscaling, load balancing, and routing.
• Engaging with open-source serving communities to evolve the framework.
• Demonstrating hands-on experience with LLM serving/orchestration packages (Triton Inference Server, vLLM, SGLang, Ollama, llm-d, KServe, LMCache, Mooncake).
• Showing deep understanding of foundational LLMs, VLMs, SLMs, and transformer-based architectures, and strong Python development skills for large-scale projects.
• Analyzing, profiling, and optimizing deep learning workloads; proactively learning the latest inference optimization techniques.
• Exemplifying excellent communication and problem-solving skills in a fast-paced, collaborative environment.
Qualifications
Minimum Qualifications:
• Bachelor’s degree in Computer Science, Engineering, Information Systems, or related field and 4+ years of Hardware Engineering, Software Engineering, Systems Engineering, or related work experience.
OR
• Master’s degree in Computer Science, Engineering, Information Systems, or related field and 3+ years of Hardware Engineering, Software Engineering, Systems Engineering, or related work experience.
OR
• PhD in Computer Science, Engineering, Information Systems, or related field and 2+ years of Hardware Engineering, Software Engineering, Systems Engineering, or related work experience.
Qualcomm is an equal opportunity employer. If you are an individual with a disability and need an accommodation during the application/hiring process, Qualcomm is committed to providing an accessible process. You may e-mail disability-accomodations@qualcomm.com or call Qualcomm’s toll-free number found here. Upon request, Qualcomm will provide reasonable accommodations to support individuals with disabilities to participate in the hiring process. Qualcomm is also committed to making our workplace accessible for individuals with disabilities.
Bonus Skills:
• Open-source contribution to any GenAI package.
• Experience architecting and developing large-scale distributed systems.
• High-level kernel design experience (PyTorch, CUDA, Triton).
• Knowledge of torch.compile or TorchDynamo.
• PhD in Computer Science, Computer Engineering, or Machine Learning.
Pay range and Other Compensation & Benefits:
$158,400.00 - $237,600.00
Note: The above pay scale reflects the broad minimum to maximum pay for the location posted. Salary is one component of total compensation, which also includes annual discretionary bonuses and RSU grants. Benefits details can be discussed with your recruiter.
EEO Statement
Qualcomm is an equal opportunity employer; all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or any other protected classification. Qualcomm is committed to an accessible process for applicants with disabilities.