NVIDIA
Principal Software Engineer – Large-Scale LLM Memory and Storage Systems
NVIDIA, Myrtle Point, Oregon, United States, 97458
Employer Industry: Technology - Artificial Intelligence and Machine Learning
Why consider this job opportunity
Salary range of $272,000 to $425,500
Opportunity for equity and comprehensive benefits package
Work with some of the most forward‑thinking and hardworking individuals in the technology sector
Chance to lead and mentor a team of engineers in a fast‑growing specialized engineering environment
Engage in innovative projects that push the boundaries of AI and memory management
What to Expect (Job Responsibilities)
Design and evolve a unified memory layer for large‑scale LLM inference across multiple memory types
Architect and implement integrations with leading LLM serving engines, focusing on KV‑cache management
Co‑design interfaces and protocols for efficient KV‑cache sharing in disaggregated cluster environments
Collaborate closely with GPU architecture and networking teams to enhance low‑latency memory access
Mentor engineers and represent the team in technical reviews and external forums
What is Required (Qualifications)
Master's degree, PhD, or equivalent experience
15+ years of experience building large‑scale distributed systems or ML systems infrastructure in C/C++ and Python
Deep understanding of memory hierarchies and experience with multi‑tier system design
Knowledge of distributed caching or key‑value systems optimized for low latency and high concurrency
Strong skills in profiling and optimizing systems across CPU, GPU, memory, and network
How to Stand Out (Preferred Qualifications)
Contributions to open‑source LLM serving or systems projects, with a focus on KV‑cache optimization
Experience in designing unified memory or storage layers in enterprise or hyperscale environments
Publications or patents related to LLM systems or memory‑disaggregated architectures
#ArtificialIntelligence #MachineLearning #TechnologyJobs #CompetitiveSalary #EngineeringLeadership
We prioritize candidate privacy and champion equal‑opportunity employment. Central to our mission is our partnership with companies that share this commitment. We aim to foster a fair, transparent, and secure hiring environment for all. If you encounter any employer not adhering to these principles, please bring it to our attention immediately.
We are not the EOR (Employer of Record) for this position. Our role in this specific opportunity is to connect outstanding candidates with a top‑tier employer.