Amazon
We're enhancing the shopping experience on Amazon through the conversational capabilities of large language models, and we're looking for innovative professionals who are passionate about technology and customer experience. You'll have the opportunity to contribute meaningfully to the industry while working alongside talented scientists, engineers, and technical program managers (TPMs) to create solutions that serve our customers.
If you're excited about collaborating with a dynamic team and contributing to this evolving field, we'd love to have you join us on this journey!
Key job responsibilities
We're looking for an experienced Software Development Manager with expertise in ML inference engines and optimization to guide a talented team in architecting, designing, developing, and enhancing high-performance, test-driven code and recipes for large language model inference that are scalable and maintainable. You'll collaborate with your team to create innovative solutions at scale, exploring new technological and scientific possibilities.
In this role, you'll help establish best practices that reduce latency and improve throughput for large language model inference. You'll work with your team to develop efficient inference optimization solutions at scale, partnering with technical and business leaders in a collaborative environment to create value for our customers.
You will also contribute to prioritization, estimation, and sprint planning activities.
Basic Qualifications
3+ years of engineering team management experience
5+ years of engineering experience
Knowledge of engineering practices and patterns for the full software/hardware/networks development life cycle, including coding standards, code reviews, source control management, build processes, testing, certification, and livesite operations
Experience partnering with product or program management teams
Experience managing a team of high-calibre Software Engineers developing complex, world-class, scalable software systems that have been successfully delivered to customers
Experience recruiting, hiring, mentoring/coaching, and managing teams of Software Engineers to improve their skills and make them more effective product software engineers
Experience with high-performance computing (low latency and/or high throughput) or real-time systems using ML models
Experience in ML DevOps practices and infrastructure as code
Preferred Qualifications
Experience with Machine and Deep Learning toolkits such as MXNet, TensorFlow, Caffe and PyTorch
Experience in ML optimization with GPU or other specialized chips (such as TPU and Neuron hardware family)
Experience with LLM serving frameworks and engines such as vLLM, TRT-LLM, and SGLang
Knowledge of ML model optimization techniques (quantization, pruning, distillation)
Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status.
Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.
Our compensation reflects the cost of labor across several US geographic markets. The base pay for this position ranges from $166,400/year in our lowest geographic market up to $287,700/year in our highest geographic market. Pay is based on a number of factors including market location and may vary depending on job-related knowledge, skills, and experience. Amazon is a total compensation company. Dependent on the position offered, equity, sign-on payments, and other forms of compensation may be provided as part of a total compensation package, in addition to a full range of medical, financial, and/or other benefits. For more information, please visit https://www.aboutamazon.com/workplace/employee-benefits.
This position will remain posted until filled. Applicants should apply via our internal or external career site.
Posted: October 14, 2025