Founding Machine Learning Engineer
Location: San Francisco, CA (onsite preferred; remote considered for exceptional candidates)
Type: Full-time
Visa Sponsorship: Available for candidates already based in the U.S.
Compensation: Competitive salary + 0.5%-2% equity
Start Date: ASAP

About Outspeed
Outspeed powers emotionally intelligent voice companions and agents with emotion and memory, redefining the way humans interact with machines through real-time, expressive, and persistent voice interfaces.
We're solving some of the toughest problems in machine learning and systems engineering, from low-latency inference and scaling to multi-user conversational memory and emotion modeling. If you're excited by the frontier of conversational AI, you'll be at home here.
Founded in 2024 and based in San Francisco, we're a tight-knit team of 4 building at the intersection of speech, emotion, and intelligence.
Learn more: outspeed.com

What You'll Do
- Own and scale key ML systems: speech models, memory, emotional synthesis, real-time transformers
- Optimize inference latency and throughput for streaming models
- Architect data pipelines for fine-tuning and continual learning
- Collaborate across voice UX, product, and backend infra to ship intelligent, responsive agents
- Push the limits of what's possible in conversational AI, from prototype to production

You Might Be a Fit If You Have:
- 2-7 years of experience as an ML engineer, especially in real-time ML systems (voice, video, or interactive apps)
- Hands-on fluency with PyTorch, CUDA, and pre-trained transformer models
- Proven experience optimizing streaming model inference performance
- Experience with voice interfaces or emotion-aware synthesis (bonus for Bark, Tortoise, etc.)
- Strong data engineering instincts: architecting and processing pipelines
- Familiarity with tools like vLLM, SGLang, or similar inference engines
- Deep interest in expressive AI, latency-sensitive systems, and emotional computing
- A degree in CS or a related field from a top-tier university (preferred)

Bonus Points For:
- Prior experience at AI-native companies like Runway ML, Descript, Anyscale, DeepMind, etc.
- Open-source contributions (share your GitHub!)
- Previous startup/founding experience or hunger for 0→1 building
- Passion for real-time voice UX, multi-modal agents, and persistent memory architectures

⚠ What We're Not Looking For:
- 15-20 year veterans with unclear startup intent or urgency
- Big tech lifers (e.g. 7+ years at FAANG with no startup exposure)
- Candidates with certification-heavy resumes and minimal build experience
- Anyone not ready to start within 1 month

Work Culture & Expectations
Location: Preferably SF-based or open to relocating. Remote is OK for exceptional global candidates.
Pace: Expect ~60-hour weeks. We value high energy and deep focus, with flexibility when it matters.
Team: Tiny but mighty. You'll be the 5th team member.