Cadre Inc
Founding Machine Learning Engineer (SF hybrid/onsite)
San Francisco, California, United States, 94199
Location: San Francisco, CA (onsite preferred; remote considered for exceptional candidates)
Type: Full-time
Visa Sponsorship: Available for candidates already based in the U.S.
Compensation: Competitive salary + 0.5%–2% equity
Start Date: ASAP
About Outspeed
Outspeed powers emotionally intelligent voice companions and agents with emotion and memory, redefining the way humans interact with machines through real-time, expressive, and persistent voice interfaces.
We’re solving some of the toughest problems in machine learning and systems engineering—from low-latency inference and scaling, to multi-user conversational memory and emotion modeling. If you’re excited by the frontier of conversational AI, you’ll be at home here.
Founded in 2024 and based in San Francisco, we're a tight-knit team of 4 building at the intersection of speech, emotion, and intelligence.
Learn more: outspeed.com
What You'll Do
Own and scale key ML systems: speech models, memory, emotional synthesis, real-time transformers
Optimize inference latency and throughput for streaming models
Architect data pipelines for fine-tuning and continual learning
Collaborate across voice UX, product, and backend infra to ship intelligent, responsive agents
Push the limits of what's possible in conversational AI—from prototype to production
You Might Be a Fit If You Have:
2–7 years of experience as an ML engineer, especially in real-time ML systems (voice, video, or interactive apps)
Hands‑on fluency with PyTorch, CUDA, and pre‑trained transformer models
Proven experience optimizing streaming model inference performance
Experience with voice interfaces or emotion‑aware synthesis (bonus for Bark, Tortoise, etc.)
Strong data engineering instincts for architecting and running processing pipelines
Familiarity with tools like vLLM, SGLang, or similar inference engines
Deep interest in expressive AI, latency‑sensitive systems, and emotional computing
A degree in CS or related field from a top‑tier university (preferred)
Bonus Points For:
Prior experience at AI‑native companies like Runway ML, Descript, AnyScale, DeepMind, etc.
Open‑source contributions (share your GitHub!)
Previous startup/founding experience or hunger for 0→1 building
Passion for real‑time voice UX, multi‑modal agents, and persistent memory architectures
⚠️ What We're Not Looking For:
15–20 year veterans with unclear startup intent or urgency
Big tech lifers (e.g. 7+ years at FAANG with no startup exposure)
Candidates with certification‑heavy resumes and minimal build experience
Anyone not ready to start within 1 month
Work Culture & Expectations
Location: Preferably SF‑based or open to relocating. Remote is OK for exceptional global candidates.
Pace: Expect ~60-hour weeks. We value high energy and deep focus, with flexibility when it matters.
Team: Tiny but mighty. You'll be the 5th team member.