Liquid AI
Member of Technical Staff - ML Research Engineer, Foundation Model Data
Liquid AI, San Francisco, California, United States, 94199
At Liquid, we’re not just building AI models—we’re redefining the architecture of intelligence itself. Spun out of MIT, our mission is to build efficient AI systems at every scale. Our Liquid Foundation Models (LFMs) operate where others can’t: on-device, at the edge, under real‑time constraints. We’re not iterating on old ideas—we’re architecting what comes next.
We believe great talent powers great technology. The Liquid team is a community of world‑class engineers, researchers, and builders creating the next generation of AI. Whether you’re helping shape model architectures, scaling our dev platforms, or enabling enterprise deployments—your work will directly shape the frontier of intelligent systems.
This Role Is For You If
You want to play a critical role in our foundation model development process, focusing on consolidating, gathering, and generating high‑quality text data for pretraining, mid‑training, SFT, and preference optimisation.
Required Experience
Experience Level: B.S. + 5 years of experience, M.S. + 3 years of experience, or Ph.D. + 1 year of experience.
Dataset Engineering: Expertise in data curation, cleaning, augmentation, and synthetic data generation techniques.
Machine Learning: Ability to write and debug models in popular ML frameworks, and experience working with LLMs.
Software Development: Strong programming skills in Python, with an emphasis on writing clean, maintainable, and scalable code.
Desired Experience
M.S. or Ph.D. in Computer Science, Electrical Engineering, Math, or a related field.
Experience fine‑tuning or customising LLMs.
First‑author publications in top ML conferences (e.g. NeurIPS, ICML, ICLR).
Contributions to popular open‑source projects.
What You’ll Actually Do
Create and maintain data cleaning, filtering, and selection pipelines that can handle >100 TB of data.
Watch for the release of public datasets on Hugging Face and other platforms.
Create crawlers to gather datasets from the web where public data is lacking.
Write and maintain synthetic data generation pipelines.
Run ablations to assess new dataset and judgement pipelines.
What You’ll Gain
Hands‑on experience with state‑of‑the‑art technology at a leading AI company.
A collaborative, fast‑paced environment where your work directly shapes our products and the next generation of LFMs.
About Liquid AI
Spun out of MIT CSAIL, we’re a foundation model company headquartered in Boston. Our mission is to build capable and efficient general‑purpose AI systems at every scale, from phones and vehicles to enterprise servers and embedded chips. Our models are designed to run where others stall: on CPUs, with low latency, minimal memory, and maximum reliability. We’re already partnering with global enterprises across consumer electronics, automotive, life sciences, and financial services. And we’re just getting started.
Seniority level: Mid‑Senior level
Employment type: Full‑time
Job function: Engineering and Information Technology
Industries: Information Services