ZipRecruiter

Data Scientist

ZipRecruiter, Northbrook, Illinois, US, 60065

Job Description:

Homethrive Overview

Homethrive, Inc. is a technology-enabled healthcare services firm revolutionizing family caregiving and aging in place. Homethrive provides a unique integration of high touch and high tech to help family caregivers support their loved ones, improve the ability of older adults to remain at home, and reduce the total cost of care.

We are led by seasoned industry veterans with prior experience building and running multi-billion-dollar businesses, and we are well funded by leading healthcare venture capital firms, including 7Wire Ventures, Human Capital, and Pitango. We operate with urgency, collaboration, and a continuous improvement mindset, always in service to our members.

General Overview

Homethrive is growing, and we are looking for a visionary Data Scientist to help build the next generation of our AI-powered products. This role is for a hands-on expert who will be responsible for designing, building, and integrating our core knowledge systems to drive more proactive and personalized support for our members. We are seeking someone with deep experience in building sophisticated AI systems from the ground up, with a special focus on Knowledge Graphs and Retrieval-Augmented Generation (RAG) systems to serve as the intelligent foundation for our AI product.

The technology we currently utilize includes:

Python

Snowflake

AWS RDS (MySQL), MongoDB Atlas

Salesforce CRM, Tableau

Cloud hosting on AWS using a mixture of Lambda, Glue, DynamoDB, and S3

Requirements:

Key Responsibilities:

Knowledge Graph & Graph RAG System Development:

Design and implement graph data models and schemas (ontologies) to represent complex relationships within the caregiving domain (e.g., members, conditions, risks, interventions).

Build and maintain data ingestion pipelines to populate the knowledge graph from structured and unstructured sources.

Develop, fine-tune, and deploy end-to-end Graph RAG systems that leverage our knowledge graph to provide contextually rich, accurate, and explainable responses for our AI assistant.

Collaborate with product and engineering teams to seamlessly integrate the Graph RAG system into our AI product, ensuring it meets performance and scalability requirements.

Data Collection and Preprocessing:

Acquire, clean, and transform structured and unstructured data from various sources to feed both traditional ML models and the knowledge graph.

Identify and address data quality issues, missing values, and outliers.

Exploratory Data Analysis and Modeling:

Perform exploratory data analysis to identify patterns, trends, and relationships within data sets.

Design and implement statistical and machine learning models to solve complex business problems, complementing the Graph RAG system.

Evaluate and optimize model performance using appropriate techniques and metrics.

Data Visualization and Storytelling:

Create compelling data visualizations and dashboards to effectively communicate findings and insights.

Present complex analytical results and the value of our graph-based AI to both technical and non-technical audiences.

Qualifications

Education: BS/MS in Computer Science, Data Science, or another related discipline, or equivalent experience.

3+ years of related professional experience required.

Strong proficiency in programming languages such as Python and SQL.

Proven, hands-on experience designing, building, and deploying Retrieval-Augmented Generation (RAG) systems in a production environment.

Deep expertise in graph database technologies (e.g., Neo4j, Neptune, TigerGraph) and graph data modeling/ontology design.

Strong background in Natural Language Processing (NLP), Natural Language Understanding (NLU), and vector databases (e.g., Pinecone, Weaviate).

Experience with Large Language Models (LLMs) and state-of-the-art NLP libraries and frameworks (e.g., Hugging Face, Transformers, LangChain, LlamaIndex).

Expertise in machine learning algorithms, statistical modeling, and data mining techniques.

Experience with data visualization tools (e.g., Tableau, Power BI).

Familiarity with cloud computing platforms (e.g., AWS, GCP, Azure) and big data technologies.

Strong knowledge of Data Warehousing (Snowflake) and Data Modeling techniques.

A successful history of integrating source systems and building data pipelines, especially for complex AI systems.

Self-directed and comfortable supporting the data needs of cross-functional teams, systems, and products in a high-pressure environment.

Passionate about learning and results-oriented.

EEO

Homethrive provides equal employment opportunities to all employees and applicants without regard to , , , (including stereotyping), , ancestry, citizenship status, pregnancy (which includes pregnancy, childbirth, and medical conditions related to pregnancy, childbirth, or breastfeeding), physical disability, mental disability, , military status or status as a Vietnam-era or special disabled veteran, marital status, registered domestic partner status, , , expression, medical condition (including, but not limited to, cancer-related or HIV/AIDS-related), genetic information, , or any other status protected by applicable federal, state, and local laws.