Amazon
Are you passionate about pushing the boundaries of Generative AI?
Join our dynamic team of Machine Learning engineers and scientists dedicated to developing cutting-edge AI technologies that enhance Amazon's offerings for businesses and customers alike.

Key Responsibilities:
- Take charge of designing, developing, testing, and deploying high-performance inference capabilities, with a focus on multi-modality, state-of-the-art model architectures, latency, throughput, and cost-efficiency.
- Collaborate closely with engineers and scientists to influence strategic direction and define the team's roadmap.
- Drive innovative system architecture, promote best practices, and mentor junior engineers.

A Day in the Life:
- Stay ahead of the curve by reading research papers and consulting with scientists on emerging techniques to enhance our projects.
- Design and experiment with new algorithms, and rigorously benchmark the latency and accuracy of implementations.
- Develop production-grade solutions and ensure swift deployment.
- Collaborate with other engineering and science teams to achieve project goals effectively.
- Uphold high standards of operational excellence, actively supporting production systems and building solutions that reduce operational load.

About Our Team:
We aim to create superior, efficient, and cost-effective large language model inference solutions and infrastructure that empower Amazon businesses to deliver outstanding value to their customers.

Basic Qualifications:
- 5+ years of professional software development experience.
- 5+ years of experience programming with at least one software programming language.
- 5+ years of experience leading the design or architecture of systems.
- Experience mentoring or leading an engineering team.
- Experience with software performance optimization, or knowledge of Machine Learning and Deep Learning.

Preferred Qualifications:
- Bachelor's degree in Computer Science or a related field.
- Experience with Large Language Model inference.
- Proficiency in GPU programming (TensorRT-LLM) or Amazon AI chip programming (Trainium).
- Strong programming skills in Python, PyTorch, and C++, with a focus on performance optimization.

Compensation:
Our compensation reflects the labor market across various U.S. regions, with a base salary range of $151,300 to $261,500 depending on location and experience. Additional forms of compensation, including equity and sign-on bonuses, may also be part of your total package, along with a comprehensive array of medical and financial benefits. This position will remain open until filled. Interested candidates should apply through our career site.