Cartesia

Product Manager, Model Behavior

Cartesia, San Francisco, California, United States, 94199

About Cartesia

Our mission is to build the next generation of AI: ubiquitous, interactive intelligence that runs wherever you are. Today, not even the best models can continuously process and reason over a year-long stream of audio, video and text (1B text tokens, 10B audio tokens and 1T video tokens), let alone do this on-device. We’re pioneering the model architectures that will make this possible. Our founding team met as PhDs at the Stanford AI Lab, where we invented State Space Models (SSMs), a new primitive for training efficient, large-scale foundation models. Our team combines deep expertise in model innovation and systems engineering with a design-minded product engineering team to build and ship cutting-edge models and experiences. We’re funded by leading investors at Index Ventures and Lightspeed Venture Partners, along with Factory, Conviction, A Star, General Catalyst, SV Angel, Databricks and others. We’re fortunate to have the support of many amazing advisors and 90+ angels across many industries, including the world’s foremost experts in AI.

About the Role

We’re seeking an exceptional Product Manager to drive model quality and behavior excellence for our text-to-speech and speech-to-text products at Cartesia. As our Model Behavior PM, you’ll be the bridge between our customers’ needs and our model development teams, defining what world-class TTS and STT models should sound like, perform like, and feel like. This role combines deep analytical rigor with customer empathy to continuously elevate our model quality and establish Cartesia as the gold standard in voice AI.

Your Impact

Define and evolve comprehensive evaluation frameworks for TTS and STT model behavior, establishing clear metrics for naturalness, accuracy, prosody, emotion, latency, and user satisfaction across diverse use cases

Conduct systematic competitive analysis by deeply using our products alongside competitors’ offerings, identifying quality gaps, behavioral differences, and opportunities for differentiation

Partner closely with data teams to design data collection strategies, labeling guidelines, and dataset curation approaches that directly improve model behavior and performance

Collaborate with evaluation teams to build rigorous testing methodologies, automated evaluation pipelines, and human evaluation protocols that catch edge cases and quality regressions

Engage directly with customers across industries to understand their voice AI requirements, gather qualitative feedback on model behavior, and translate insights into actionable product improvements

Drive cross-functional alignment between research, engineering, data, and GTM teams to prioritize and execute on model behavior improvements that deliver maximum customer impact

Build a deep intuition for what makes TTS and STT models truly great—from subtle pronunciation nuances to handling of edge cases—and champion quality standards across the organization

Create frameworks, documentation, and best practices that help internal teams and customers understand model capabilities, limitations, and optimal usage patterns

What You Bring

6+ years of product management experience with technical products, preferably in AI/ML, audio, or speech technologies

Strong analytical mindset with experience designing evaluation frameworks, defining success metrics, and making data-driven quality decisions

Deep customer empathy with proven ability to conduct user research, synthesize qualitative feedback, and translate needs into product requirements

Technical fluency to work effectively with ML researchers, data scientists, and engineers—understanding model behavior at a detailed level

Exceptional attention to detail and quality standards, with the ability to notice subtle differences in model outputs and articulate what makes one better than another

Experience working cross-functionally with data teams, engineering teams, and evaluation/testing teams

Strong communication skills to advocate for quality and influence technical teams toward customer-centric decisions

Nice to Have

Direct experience with speech technologies (TTS, STT, voice cloning, or conversational AI)

Background in linguistics, audio engineering or speech sciences

Experience with ML model evaluation, A/B testing methodologies, or human evaluation design

Familiarity with audio quality metrics (MOS, WER, CER, prosody analysis)

Prior experience at a company known for exceptional product quality and attention to detail

What We Offer

Lunch, dinner and snacks at the office

Fully covered medical, dental, and vision insurance for employees

401(k)

Relocation and immigration support

Your own personal Yoshi

Our Culture

We’re an in-person team based out of San Francisco. We love being in the office, hanging out together and learning from each other every day. We ship fast. All of our work is novel and cutting-edge, and execution speed is paramount. We have a high bar, and we don’t sacrifice quality and design along the way. We support each other. We have an open and inclusive culture that’s focused on giving everyone the resources they need to succeed.
