MindDoc
ML Data Engineer (m,f,d) – MindDoc
MindDoc has been Germany's leading provider of video-based psychotherapy since 2017. In addition to psychotherapy, we operate a mental health app downloaded by over three million people worldwide. We make help and support easily accessible anytime, anywhere for everyone.
As an ML Data Engineer at MindDoc, you will:
Set up and operate the audio‑to‑text pipeline, including evaluating ASR models (Whisper, AssemblyAI, specialized providers) and speaker diarization for therapist/patient separation.
Automate the processing chain: ingestion → transcription → quality check → storage.
Integrate and support annotation tools (Label Studio, Prodigy, or similar).
Design database and storage architecture for transcripts and metadata.
Monitor data quality and handle pipeline errors.
Interface with the vAI platform for RAG data delivery.
You should:
Have 2–4 years of experience building data pipelines (Python, ideally also Go).
Ideally have experience in audio/speech processing.
Have solid knowledge of SQL, NoSQL, and cloud infrastructure (AWS/GCP/Azure).
Work pragmatically, hands‑on, and build solutions independently.
Be fluent in English; German is a plus.
Nice‑to‑have: experience with MLOps tools (Airflow, Prefect, DVC).
You can expect
A meaningful job: digitizing mental health and helping thousands of people daily.
A balanced environment – a digital‑health startup with flat hierarchies and a stable, family‑owned hospital group as our 100% shareholder.
30 days vacation.
Remote work with optional office in Berlin or Munich.
Sponsored lunch.
VIP treatment in Schön Klinik’s portfolio, pension plan, and child care.
Company bike lease options.
Employee discounts at more than 600 brands.
EGYM Wellpass.
Seniority level: Associate
Employment type: Full-time
Job function: Quality Assurance and Health Care Provider
Industries: Mental Health Care
Location: Cologne, North Rhine-Westphalia, Germany