Causal Labs, Inc.
Machine Learning - Infrastructure
Causal Labs, Inc., San Francisco, California, United States, 94199
About us
Our mission is to build causal intelligence, starting with physics models to predict and control the weather.
We're building a small team driven by a deep passion and urgency to solve this civilizationally important problem.
Our founding team has led & shipped models across self‑driving cars, humanoid robotics, protein folding, and video generation at world‑class institutions including Google DeepMind, Cruise, Waymo, Meta, Nabla Bio, and Apple.
Responsibilities
Design, deploy, and maintain large distributed ML training and inference clusters
Develop efficient, scalable end‑to‑end pipelines to manage petabyte‑scale datasets and model training throughout the entire ML lifecycle
Research and test various training approaches including parallelization techniques and numerical precision trade‑offs across different model scales
Analyze, profile, and debug low‑level GPU operations to optimize performance
Stay up to date on research and bring new ideas into our work
What we’re looking for
We value a relentless approach to problem‑solving, rapid execution, and the ability to quickly learn in unfamiliar domains.
Strong grasp of state‑of‑the‑art techniques for optimizing training and inference workloads
Demonstrated proficiency with distributed training frameworks (e.g., FSDP, DeepSpeed) to train large foundation models
Knowledge of cloud platforms (GCP, AWS, or Azure) and their ML/AI service offerings
Familiarity with containerization and orchestration frameworks (e.g., Kubernetes, Docker)
Background working on distributed task management systems and scalable model serving & deployment architectures
Understanding of monitoring, logging, observability, and version control best practices for ML systems
You don’t have to meet every single requirement above.
Benefits
Work on deeply challenging, unsolved problems
Competitive cash and equity compensation
Medical, dental, and vision insurance
Catered lunch & dinner
Unlimited paid time off
Visa sponsorship & relocation support