Nuance Labs
Founding Research Scientist - Speech Synthesis
About Nuance Labs
Nuance Labs is an early-stage deep tech startup. We’re building the first real-time human foundation model — unifying text, speech, and vision — to make AI socially and emotionally intelligent. Imagine an AI that can understand a quirked eyebrow, a shift in tone, or a hesitant pause, and respond in a way that feels truly human.
Key Facts
A world-class team of PhDs from MIT, UW, and Oxford with decades of industry experience at Apple and Meta, advancing real-time avatars from cutting-edge research to products used by millions.
$10M seed round backed by Accel, South Park Commons, Lightspeed, and top angels including Synthesia’s former CPO.
In-person collaboration, 5 days a week at Seattle HQ
This is for you if you:
Have a PhD (or equivalent experience) in training speech synthesis models (text-to-speech, speech-to-speech, etc.), training audio generation models, or related fields, with a track record of pushing the research frontier
Know deep learning inside out and can run the whole ML pipeline, from data wrangling and rapid prototyping to large-scale training, benchmarking, and evaluation
Love blank-page problems, chart your own course, and make progress without waiting for someone to hand you a task list
Move quickly from research breakthroughs to practical, real-world applications
Write code clean enough that your future self will thank you for it
Play well with other brilliant minds from different domains
What you’ll be building
The first human foundation model that operates across text, speech, facial expression, and body language in real time. This unified model:
Understands fine-grained human signals — from a quirked eyebrow to a subtle change in voice — and infers meaning in context
Generates lifelike, responsive avatars whose expressions, gestures, and tone evolve frame-by-frame to deliver genuine responses
The landscape is ripe for innovation. While voice AI systems have made great strides in capturing prosody, and avatar platforms can generate compelling visuals, existing solutions remain fragmented. Real-time, multimodal interaction — where voice, facial expression, and contextual perception converge — is still an unsolved problem. This role offers the rare opportunity to shape foundational technology in a space where the boundaries are still being defined.
Why this team
We’re research scientists who’ve spent years advancing AI avatar and audio-visual generation — publishing at top conferences and shipping ultra-low-latency ML products to millions. We combine frontier research with the ruthless engineering needed for consumer-grade, real-time systems.
Fangchang Ma: CEO. MIT PhD in Robotics and ML. Previously Research Manager at Apple, with experience at DJI. Published in top AI conferences with 2400+ citations.
Edward Zhang: CTO. UW PhD in Computer Graphics. Previously Senior Research Scientist at Apple, with experience at Google and Microsoft.
Karren Yang: Founding Research Scientist. MIT PhD. Previously Senior Research Scientist at Apple, with experience at Niantic Labs, Meta Reality Labs, Bosch Center for AI, and Adobe Research. Published in top AI conferences with 1800+ citations.
Claudia Vanea: Founding Research Scientist. Oxford PhD in AI. Previously founder at South Park Commons. Developed the first AI computational framework in her field, with performance surpassing domain experts. Published in top AI conferences and scientific journals.
Yaser Sheikh: Advisor. Former Vice President of Codec Avatars at Meta. Consulting professor at the Robotics Institute, Carnegie Mellon University.
We’re a small, fast-moving research team with an exceedingly high bar — bringing on only the very best talent. Every member has massive ownership, deep trust, and the opportunity to shape both the technology and the company from the ground up.
How to Apply
Email careers@nuancelabs.ai with your CV and a short note on why you are a strong fit.
Seniority level
Mid-Senior level
Employment type
Full-time
Industries
Software Development