Anthropic

Software Engineer, Inference

Anthropic, New York, New York, us, 10261


About Anthropic

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a growing group of researchers, engineers, policy experts, and business leaders building beneficial AI systems.

About the role

Our Inference team builds and maintains the critical systems that serve Claude to millions of users worldwide. We deploy models via the industry’s largest compute-agnostic inference deployments, handling the entire stack from intelligent request routing to fleet-wide orchestration across diverse AI accelerators. The team has a dual mandate:

- maximizing compute efficiency to support customer growth, and
- enabling breakthrough research by providing high-performance inference infrastructure to our scientists.

We tackle complex, distributed systems challenges across multiple accelerator families, emergent AI hardware, and multiple cloud platforms.

You may be a good fit if you:

- Have significant software engineering experience, particularly with distributed systems
- Are results-oriented, with a bias towards flexibility and impact
- Pick up slack, even if it goes outside your job description
- Enjoy pair programming
- Want to learn more about machine learning systems and infrastructure
- Thrive where technical excellence drives both business results and research breakthroughs
- Care about the societal impacts of your work

Strong candidates may also have experience with:

- Implementing and deploying machine learning systems at scale
- Load balancing, request routing, or traffic management systems
- LLM inference optimization, batching, and caching strategies
- Kubernetes and cloud infrastructure (AWS, GCP)
- Python or Rust

Representative projects:

- Designing intelligent routing algorithms that optimize request distribution across thousands of accelerators
- Autoscaling compute fleets to match supply with demand across production, research, and experimental workloads
- Building production-grade deployment pipelines for releasing new models to millions of users
- Integrating new AI accelerator platforms to maintain a hardware-agnostic advantage
- Contributing to new inference features (e.g., structured sampling, prompt caching)
- Analyzing observability data to tune performance for production workloads
- Managing multi-region deployments and geographic routing for global customers

Compensation

The expected base compensation for this position is below. Our total compensation package for full-time employees includes equity, benefits, and may include incentive compensation.

$300,000 - $485,000 USD

Logistics

Education requirements:

We require at least a Bachelor's degree in a related field or equivalent experience.

Location-based hybrid policy:

Currently, we expect all staff to be in one of our offices at least 25% of the time. Some roles may require more time in our offices.

Visa sponsorship:

We sponsor visas. If we offer you a role, we will make reasonable efforts to support your visa application with an immigration lawyer as needed.

Equal opportunity and inclusion: We encourage you to apply even if you do not meet every listed qualification. We value diverse perspectives and strive to include a range of candidates on our team.

How we’re different

We pursue high-impact AI research as a coordinated team on a few large-scale efforts. We value impact and the long-term goals of steerable, trustworthy AI, and prioritize collaboration, strong communication, and empirical progress. Come work with us!

Anthropic is a public benefit corporation with offices in San Francisco. We offer competitive compensation and benefits, flexible working hours, generous vacation and parental leave, and a collaborative office space.
