Amazon

Machine Learning - Compiler Engineer II, AWS Neuron, Annapurna Labs

Amazon, Austin, Texas, US, 78716

Do you want to be part of the AI revolution? At AWS, our vision is to make deep learning pervasive for everyday developers and to democratize access to AI hardware and software infrastructure. Our AWS Neuron SDK optimizes the performance of complex ML models executed on AWS Inferentia and Trainium, custom chips designed to accelerate deep‑learning workloads. This role is for a software engineer on the Neuron Compiler team, which is responsible for building the next‑generation Neuron compiler. The compiler transforms models written in frameworks such as PyTorch, TensorFlow, and JAX into production‑grade code that runs efficiently on AWS Inferentia and Trainium‑based servers. You will solve hard compiler optimisation problems to achieve optimum performance for a variety of ML model families (including large language models, vision transformers, Stable Diffusion, and multi‑model pipelines) by understanding these models inside out and designing compiler passes that generate the best instruction streams. You will partner with internal customers, external stakeholders, and chip architects to bring new products and features to market, ensuring the Neuron compiler is highly performant and easy to use.

Key Responsibilities

• Design, implement, test, deploy, and maintain innovative software solutions that improve Neuron compiler performance, stability, and user interface.

• Work side by side with chip architects, runtime/OS engineers, scientists, and ML application teams to deploy state‑of‑the‑art ML models on AWS accelerators with optimal cost/performance benefits.

• Build and contribute to open‑source projects (e.g., StableHLO, OpenXLA, MLIR) to pioneer optimisation of advanced ML workloads on AWS software and hardware.

• Develop compiler optimisation and verification passes, build tooling for numerical‑error analysis, and resolve root causes of compiler defects.

• Participate in design discussions and code reviews, and communicate with internal and external stakeholders, including open‑source communities.

• Work in a startup‑like environment, focusing on the most critical tasks that have the greatest impact.

A Day in the Life

You will design and code solutions that drive efficiencies in compiler architecture: creating optimisation and verification passes, building surface‑level APIs for AWS accelerators, implementing tools to analyze numerical errors, and resolving compiler defects. You will also participate in design discussions and code reviews, and communicate with both internal and external teams. The role is fast‑paced and requires a collaborative, self‑directed approach.

Basic Qualifications

3+ years of non‑internship professional software development experience

2+ years of non‑internship design or architecture experience (design patterns, reliability, and scaling) of new and existing systems

Experience programming with at least one object‑oriented language such as C++ or Java

Preferred Qualifications

Master's degree or PhD in Computer Science or a related technical field

3+ years of experience writing production‑grade code in C++/Java

Experience in compiler design for CPUs, GPUs, vector engines, or ML accelerators

Experience with open‑source compiler toolsets such as LLVM/MLIR

Experience with PyTorch, OpenXLA, StableHLO, JAX, TVM, deep learning models, and algorithms

Experience with modern build systems like Bazel or CMake

Bonus: knowledge of OpenXLA, StableHLO, MLIR and related technologies

Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status.

Explore the product and our history: https://awsdocs-neuron.readthedocs-hosted.com/en/latest/neuron-guide/neuron-cc/index.html, https://aws.amazon.com/machine-learning/neuron/, https://github.com/aws/aws-neuron-sdk, https://www.amazon.science/how-silicon-innovation-became-the-secret-sauce-behind-awss-success.
