Beam

Machine Learning Engineer

Beam, New York, New York, US, 10261

Beam is an ultrafast AI inference platform. We built a serverless runtime that launches GPU-backed containers in less than 1 second and quickly scales out to thousands of GPUs. Developers use our platform to serve apps to millions of users around the globe. We're backed by Y Combinator, Tiger Global, and prominent developer-tool founders, including the founder of Snyk and former CTO of GitHub. Our team works in person in New York City, but we welcome remote applicants who are exceptionally qualified.

About the Role

In this role, you'll optimize inference performance for a wide range of models running on our platform. You will minimize latency, maximize throughput, and continuously experiment to achieve industry-leading performance. Your work will directly impact millions of users worldwide.

Skills & Experience

- Experience with the state-of-the-art inference stack (e.g., PyTorch, TensorRT, vLLM)
- Familiarity with modern AI workflows, like ComfyUI and LoRA adapters for fine-tuning
- Deep understanding of model compilation, quantization, and serving architectures
- Familiarity with GPU architectures and comfort diving into kernel-level optimizations to resolve performance bottlenecks
- Experience programming with CUDA, Triton, or similar low-level accelerator frameworks

Benefits

- Work on challenging and impactful engineering problems
- Competitive salary and meaningful equity
- Join a fast-growing pre-Series A company at the ground floor
- Health, dental, and vision benefits with 90% coverage for employees and 50% for dependents
- Opportunities to participate in events across the cloud-native and AI communities
- Fitness stipend, learning budget, and much more
