NVIDIA

LLM Inference Performance Engineer — GPU Optimization, Equity

NVIDIA, New York, New York, US, 10261

A leading technology company in New York is looking for a Software Engineer specializing in performance analysis and optimization for LLM inference. The ideal candidate will improve the efficiency and scalability of large language models through compiler- and kernel-level optimizations. Responsibilities include analyzing performance bottlenecks and collaborating with other teams to improve runtime behavior. Candidates should have a Master’s or PhD in a relevant field and strong programming experience in C++ and Python. The salary range is $124,000 to $195,500, with additional equity and benefits.