Pear VC
Member of Technical Staff, Machine Learning
Company: NomadicML
Overview
Americans drive over 5 trillion miles a year, more than 500 billion of them recorded. Buried in that footage is the next frontier of machine intelligence. At NomadicML, we’re building the platform that unlocks it.
Our Vision‑Language Models (VLMs) are the new “hydraulic mining” for video: they transform raw footage into structured intelligence that powers real‑world autonomy and robotics. We partner with industry leaders across self‑driving, robotics, and industrial automation to mine insights from petabytes of data that were once unusable.
NomadicML was founded by Mustafa Bal and Varun Krishnan, who met at Harvard University while studying Computer Science. Our team has built mission‑critical AI systems at Snowflake, Lyft, Microsoft, Amazon, and IBM Research; holds top‑tier publications in VLMs and AI at conferences like CVPR; and moves with the speed and clarity of a startup obsessed with impact.
The Role
We’re seeking a Machine Learning Engineer who thrives at the frontier of foundation‑model research and production engineering. You’ll help define how machines learn from motion: training and fine‑tuning large‑scale Vision‑Language Models to reason about complex, real‑world video.
Your work will involve building multi‑modal architectures that perceive, localize, and describe motion events (turns, lane changes, interactions, anomalies) across millions of frames, and turning those breakthroughs into robust APIs and SDKs used by enterprise customers.
Responsibilities
Train and evaluate VLMs specialized for motion understanding on autonomous‑driving and robotics datasets.
Design and scale GPU‑accelerated pipelines for training, fine‑tuning, and inference on multi‑modal data (video + language + sensor metadata).
Build agentic evaluation frameworks that benchmark spatiotemporal reasoning, localization accuracy, and narrative consistency.
Develop and productionize curation loops that use our own models to generate and refine datasets (“AI training AI”).
Publish high‑impact research (e.g., NeurIPS, CVPR) while shipping features that customers use immediately.
Qualifications
Strong proficiency in Python, PyTorch, and large‑scale ML workflows.
Research experience in foundation models, VLMs, or multi‑modal learning (publications/patents a plus).
Ability to iterate quickly and autonomously, running experiments end‑to‑end.
Experience training or fine‑tuning models on video or sensor data.
Understanding of retrieval systems, embeddings, and GPU optimization.
Nice to Have
Contributions to open‑source ML frameworks (e.g., DeepSpeed, Hugging Face).
Experience with vector databases, distributed training, or ML orchestration systems (e.g., Ray, Kubeflow, MLflow).
Prior exposure to autonomous‑driving or robotics datasets.
Job Details
Seniority level: Mid‑Senior level
Employment type: Full‑time
Job function: Engineering and Information Technology
Industries: Venture Capital and Private Equity Principals