The Rundown AI, Inc.
Engineering Manager, Capacity
The Rundown AI, Inc., San Francisco, California, United States, 94199
About the role
Anthropic’s Capacity team is looking for an Engineering Manager to own and manage cloud spend across a massively scaled, multi-cloud environment. You’ll work closely with research, engineering, and finance teams to ensure we have scalable systems for capacity management, high-quality data and insights for planning, and engineering roadmaps that deliver efficiency wins.

Responsibilities:
- Design, develop, and deliver capacity management systems for AI workloads on heterogeneous infrastructure
- Build and maintain robust attribution of usage and enable in-depth data-driven insights that are actionable
- Build a deep understanding of research and training workloads to accurately forecast infrastructure needs
- Oversee design and implementation of forecasting tools and software systems for managing billions of dollars in spend
- Proactively identify efficiency opportunities and collaborate with teams across the org to increase effective capacity for Anthropic
- Partner closely with Finance and leadership, providing detailed and clear capacity inputs for financial planning and strategic decision making

You may be a good fit if you:
- Have experience managing $XXXM to $XB in infrastructure spend
- Have experience working with public clouds (AWS, GCP, Azure, etc.) and/or hybrid on-prem, cloud environments
- Have experience setting up capacity management systems that scale with growing organizations
- Are comfortable leveraging data and have experience building observability for complex systems
- Have strong interpersonal skills that enable you to influence and build cross-organizational support for capacity initiatives
- Have familiarity with LLMs and a deep interest in learning more about research and model training workloads

Strong candidates may also have some of the following:
- Past experience managing capacity for AI research and production workloads
- Past experience partnering with senior leadership, both technical and non-technical, to drive company-level reporting and decision making