Majestic Labs AI
Senior Compiler Engineer (PyTorch / Triton / LLVM / MLIR)
Majestic Labs AI, Los Altos, California, United States, 94024
Role Description
We are looking for an experienced, highly skilled, and motivated Senior Compiler Engineer to join our compiler team. In this role, you will design, develop, and maintain critical components of our AI-driven compilation stack. You will work across PyTorch, Triton, LLVM, and MLIR to build robust, scalable, and high-performance solutions for diverse hardware backends. This is a hands‑on technical position where you will solve complex problems, optimize for performance, and contribute to the next generation of our technology.
What You’ll Do
Design, Implement, and Optimize Compiler Components: Architect and develop critical compiler modules to efficiently translate and optimize AI models for deployment across a variety of hardware platforms, including CPUs, GPUs, and emerging custom accelerators
Toolchain Integration: Enhance and unify PyTorch Inductor, Triton, LLVM, and MLIR toolchains to support cutting‑edge architectures, facilitate seamless interoperability, and enable rapid experimentation with new compiler features
IR and Code Generation: Create and maintain custom IRs, code generation passes, and optimization strategies tailored for AI workloads, focusing on both general and domain‑specific improvements
Kernel Performance: Profile and tune computational kernels, such as linear algebra operations, matrix multiplications, and elementwise computations, to achieve optimal performance, scalability, and resource efficiency on diverse hardware
Open‑Source Engagement: Actively contribute to the LLVM, MLIR, Triton, and PyTorch open‑source projects, sharing improvements, collaborating with the developer community, and driving the evolution of the AI compiler ecosystem
Requirements
Bachelor’s or Master’s degree in Computer Science, Computer Engineering, or a related field from a recognized university
10+ years of experience in compiler engineering or closely related fields
Deep knowledge of LLVM and MLIR internals, including IR transformations, code generation, and backend optimization techniques
Experience with PyTorch (Inductor/Dynamo) and Triton, including their compiler subsystems
Proven expertise in C/C++ programming
Demonstrated expertise in performance optimization, including vectorization, parallelization, and hardware‑specific tuning
Advanced debugging, analytical, and system‑level thinking skills
Excellent communication skills with a strong track record of cross‑functional collaboration
Ways to Stand Out from the Crowd
Strong understanding of AI models – both training and inference pipelines
Experience developing compiler support for custom hardware accelerators, including ASICs, FPGAs, or novel AI chips
Active contributions to open‑source compiler frameworks, demonstrating leadership and community involvement
Familiarity with distributed training strategies, graph compilers, and advanced memory models