Cartesia
Product Manager, Model Behavior
Cartesia is building the next generation of AI: ubiquitous, interactive intelligence that runs wherever you are. Today, no model can continuously process and reason over a year‑long stream of audio, video and text—1B text tokens, 10B audio tokens, 1T video tokens—let alone do this on‑device.
We’re pioneering model architectures that will make this possible, leveraging innovations such as State Space Models (SSMs) and deep expertise in model innovation and systems engineering.
We are funded by leading investors including Index Ventures, Lightspeed Venture Partners, Factory, Conviction, A Star, General Catalyst, SV Angel, Databricks and others.
About The Role
We’re seeking a Product Manager to drive model quality and behavior excellence for our text‑to‑speech (TTS) and speech‑to‑text (STT) products. As the Model Behavior PM, you’ll bridge customer needs and model development teams, defining what world‑class TTS and STT models should sound like, perform like, and feel like.
Your Impact
Define and evolve evaluation frameworks for TTS and STT model behavior, establishing metrics for naturalness, accuracy, prosody, emotion, latency, and user satisfaction.
Conduct competitive analysis, identifying quality gaps and differentiation opportunities.
Partner with data teams to design data collection strategies, labeling guidelines, and dataset curation that improve model behavior.
Collaborate with evaluation teams to build testing methodologies, automated pipelines and human evaluation protocols to catch edge cases and regressions.
Engage customers across industries to gather feedback and translate insights into product improvements.
Align research, engineering, data and GTM teams to prioritize and execute model behavior improvements.
Build a deep intuition for what makes TTS and STT models great, championing quality standards across the organization.
Create frameworks, documentation, and best practices to help teams and customers understand model capabilities and limits.
What You Bring
6+ years of product management experience with technical products, preferably in AI/ML, audio or speech technologies.
Strong analytical mindset, with experience designing evaluation frameworks and making data‑driven quality decisions.
Deep customer empathy, with a proven ability to conduct user research and translate needs into product requirements.
Technical fluency to work with ML researchers, data scientists and engineers—understanding model behavior at a detailed level.
Exceptional attention to detail and high quality standards, including the ability to notice subtle differences in model outputs.
Experience working cross‑functionally with data, engineering and evaluation/testing teams.
Strong communication skills to advocate for quality and influence technical teams toward customer‑centric decisions.
Nice to Have
Direct experience with speech technologies (TTS, STT, voice cloning, conversational AI).
Background in linguistics, audio engineering or speech sciences.
Experience with ML model evaluation, A/B testing methodologies or human evaluation design.
Familiarity with speech quality and accuracy metrics (MOS, WER, CER, prosody analysis).
Prior experience at a company known for exceptional product quality.
About Compensation
Location: San Francisco, CA
Annual base salary: $142,000–$201,000 (variable). Relocation and immigration support provided.
Benefits include full medical, dental and vision coverage, 401(k), paid time off, lunch/dinner/snacks, and a supportive culture.