Big Cloud

Machine Learning Engineer

Big Cloud, San Francisco, California, United States, 94199

Senior Consultant | AI / Robotics and Autonomous Systems

I'm looking for a hands-on ML Infrastructure Engineer to help scale and optimize large-scale training systems for robotics and AI. This is a high-impact role working close to the GPUs, driving inference, ML Ops, and distributed training at scale.

What you’ll do:

Build and maintain infrastructure for large-scale training (scheduling, orchestration, checkpointing, metrics).

Scale JAX-based pipelines across GPU/TPU clusters for high-throughput experiments.

Optimize performance across data pipelines, model loops, and distributed sync.

Partner with researchers to turn ideas into production-ready training runs.

What we’re looking for:

Strong software engineering skills in ML infrastructure/platforms.

Hands-on experience with JAX (preferred), PyTorch, or TensorFlow.

Proven expertise in distributed training and performance optimization.

Strong communicator who thrives on collaborating with researchers and engineers.

A scrappy, ownership-driven builder who loves scaling systems fast.

This is a rare chance to work at the intersection of foundation models and robotics, helping shape the future of physical AI.

Seniority level: Mid-Senior level

Employment type: Full-time

Job function: Staffing and Recruiting
