Isotron AI

Founding Engineer, Machine Learning

Isotron AI, San Francisco, California, United States, 94199

Job Description

The following information aims to provide potential candidates with a better understanding of the requirements for this role.

About Us

We’re an early-stage stealth startup building a new kind of platform for generative media. Our mission is to enable the future of real-time generative applications: we’re building the foundational tools and infrastructure that make entirely new categories of generative experiences and applications finally possible.

We’re a small, focused team of ex-YC and unicorn founders and senior engineers with deep experience across 3D, generative video, developer platforms, and creative tools. We’re backed by top-tier investors and top angels, and we’re building a new technical foundation purpose-built for the next era of generative media.

We’re operating at the edge of what’s technically possible: high-performance inference and real-time orchestration of multimodal models. As one of our founding engineers, you’ll play a key role in architecting the core platform, shaping system design decisions, and owning critical infrastructure from day one.

If you’re excited about architecting and building high-performance infrastructure that empowers the next generation of developers and unlocks entirely new product categories, we’d love to talk.

About the Role

We’re looking for a Founding Machine Learning Engineer to build the core infrastructure powering high-performance inference for generative media models, including diffusion and transformer architectures. You’ll be instrumental in designing low-latency, high-throughput systems that serve state-of-the-art models in real time. As an early technical leader, you’ll shape both our systems and culture from day one.

What You’ll Do

Architect and implement the inference engine for diffusion transformer-based generative models
Optimize model execution across the stack: memory, compute, and networking
Drive performance engineering to minimize latency and maximize throughput
Work closely with research to productionize new generative techniques and model variants
Build the tools, services, and monitoring that make these systems robust and scalable
Set the technical bar and help define engineering culture as an early team member

Requirements

About You

3+ years of experience building high-performance ML or systems infrastructure
Deep fluency with PyTorch and production-grade Python
Strong understanding of GPU systems (CUDA, memory hierarchies, scheduling, etc.)
Experience optimizing inference for generative models (e.g., diffusion, transformers)
Bonus: familiarity with Triton, CUDA, TensorRT, or model parallelism techniques
Startup-ready: you take ownership, move quickly, and solve hard problems end to end

Minimum Qualifications

Strong Python + PyTorch skills
Proven experience optimizing inference for generative models
Deep systems knowledge, especially GPU performance tuning
High agency and eagerness to build from scratch

Benefits

Competitive SF salary and foundational team equity