NVIDIA

LLM Inference Performance Engineer - GPU Compiler Optimizer

NVIDIA, Austin, Texas, US, 78716

A leading technology company in Austin seeks a Software Engineer focused on performance analysis and optimization for LLM inference. In this role, you'll improve the efficiency of large language models (LLMs) on NVIDIA platforms through compiler and kernel analysis. Key responsibilities include analyzing performance bottlenecks, designing new compiler passes, and collaborating with teams on cutting-edge technology. The ideal candidate will have a Master’s or PhD in Computer Science, strong programming skills in C++ and Python, and experience with deep learning frameworks. Competitive salary and benefits are offered.