
AI Infrastructure Engineer

Location: San Francisco (Onsite)

Type: Full-time

Start Date: ASAP

What You'll Do

Design and build infrastructure for deploying, scaling, and managing AI/ML workloads

Develop automation for GPU cluster provisioning, configuration, and orchestration

Build systems for hardware-aware model deployment and inference optimization

Create tooling for AI infrastructure observability, debugging, and performance tuning

Work on integration between hardware intelligence and ML frameworks

Collaborate with customers deploying large-scale AI systems in production

Optimize resource utilization across heterogeneous compute (GPUs, TPUs, custom accelerators)

What You Bring

Strong experience with:

GPU cluster management and orchestration (SLURM, Kubernetes, Ray)

ML infrastructure and frameworks (PyTorch, TensorFlow, JAX, NVIDIA stack)

Distributed training and inference systems

Container orchestration for ML workloads (Docker, Kubernetes, Kubeflow)

Linux systems programming and performance optimization

Python and systems scripting

Familiarity with:

Hardware architectures for AI (NVIDIA GPUs, AMD GPUs, custom accelerators)

High-performance networking for distributed ML (NCCL, InfiniBand, RoCE)

Model serving infrastructure (Triton, vLLM, TensorRT)

Storage systems for ML workloads (distributed filesystems, object storage)

Infrastructure as Code and GitOps workflows

What We're Looking For

We're looking for an AI infrastructure engineer who understands the full stack, from silicon to model serving, and can build systems that make AI deployment effortless.

You should have:

Deep understanding of what it takes to run AI workloads at scale

Experience with the operational challenges of GPU clusters and ML infrastructure

Ability to debug performance issues across hardware, networking, and software

Comfort working across infrastructure, ML frameworks, and developer experience

Excitement about building the foundational layer for physical AI systems

Requirements:

Bachelor's or Master's in Computer Science, Computer Engineering, or equivalent experience

3+ years of experience in ML infrastructure, MLOps, or AI platform engineering

Willingness to work startup hours (weekends included), in person at our San Francisco office

Work authorization in the United States

Why Join

We're building the intelligence layer for hardware: real-time systems that control physical machines with zero tolerance for latency or failure.

What we offer:

Startup-level equity and highly competitive salary

Ownership over AI infrastructure that powers next-generation systems

Problems at the intersection of hardware intelligence and machine learning

Close collaboration with customers pushing the boundaries of AI deployment

How to Apply

Email: team@cosmiclabs.io

Subject line: AI Infrastructure / [Your Name]

Include in your email:

Your name

Why this role and why Cosmic Labs

What you bring technically

Soonest available start date

GitHub or GitLab link

Confirmation of work authorization in the U.S.

Confirmation of willingness to work full-time, in-person in San Francisco

Attach:

PDF resume
