About NomadicML
Americans drive over 5 trillion miles a year, more than 500 billion of them recorded. Buried in that footage is the next frontier of machine intelligence. At NomadicML, we’re building the platform that unlocks it.
Our Vision-Language Models (VLMs) act as the new “hydraulic mining” for video, transforming raw footage into structured intelligence that powers real-world autonomy and robotics. We partner with industry leaders across self-driving, robotics, and industrial automation to mine insights from petabytes of data that were once unusable.
NomadicML was founded by Mustafa Bal and Varun Krishnan, who met at Harvard University while studying Computer Science.
Mustafa is a core contributor to ONNX Runtime and DeepSpeed, with deep expertise in distributed systems and large-scale model training infrastructure.
Varun is an INFORMS Wagner Prize Finalist for his research in large-scale driver navigation AI models and one of the top chess players in the US.
Our team has built mission-critical AI systems at Snowflake, Lyft, Microsoft, Amazon, and IBM Research; holds top-tier publications in VLMs and AI at conferences like CVPR; and moves with the speed and clarity of a startup obsessed with impact.
About the Role
We’re looking for a Backend / Infrastructure Engineer who thrives at the intersection of cloud systems, SDK design, and large-scale inference infrastructure.
You’ll build and scale the backbone that powers NomadicML’s video intelligence platform — from secure cloud ingestion to distributed GPU inference pipelines that run our largest foundation models. You’ll collaborate with ML researchers to productionize their models, automate deployment and scaling, and expose those capabilities through clean APIs and SDKs used by enterprises worldwide.
This role blends systems engineering, distributed compute orchestration, and developer experience. You’ll work across cloud storage, inference scheduling, GPU clusters, and the NomadicML SDK.
What You’ll Build
GPU Inference Workflows: Architect pipelines to run massive multi-GPU inference jobs on foundation-scale video models, optimizing for throughput, cost, and reliability.
Cloud Upload Infrastructure: Build direct integrations with Amazon S3, Google Cloud Storage, and Azure Blob Storage to support large-scale ingest via signed URLs and resumable uploads.
Distributed Processing Pipelines: Design event-driven, autoscaling job systems using Kubernetes, Pub/Sub, or Ray for analyzing terabytes of video data in parallel.
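The map/gather shape of such a pipeline can be sketched with a stdlib thread pool standing in for a Ray cluster or Pub/Sub worker fleet; the per-chunk analysis step here is a hypothetical placeholder for real VLM inference:

```python
from concurrent.futures import ThreadPoolExecutor


def analyze_chunk(chunk: bytes) -> int:
    """Hypothetical per-chunk analysis; a real pipeline would run
    model inference here instead of counting bytes."""
    return len(chunk)


def process_video(chunks: list[bytes], workers: int = 4) -> list[int]:
    """Fan each chunk out to a worker and gather results in input order,
    the same map/gather shape a Ray- or queue-based job system uses
    (threads stand in for distributed workers in this sketch)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(analyze_chunk, chunks))
```

In a production system the executor would be replaced by remote tasks or queue consumers, but the decomposition into independent per-chunk jobs is the same.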
Developer SDKs and APIs: Power the NomadicML Python SDK used for programmatic video ingestion, analysis, and search — the core tool researchers and customers rely on.
End-to-End Observability: Build logging, tracing, and metrics pipelines that surface GPU utilization, job latency, and per-video inference health.
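As a minimal sketch of the per-job latency signal such a pipeline might emit, the decorator below logs a structured JSON record around each call; the job name and stand-in model call are hypothetical:

```python
import json
import logging
import time
from functools import wraps

logger = logging.getLogger("inference.metrics")


def track_latency(job_name: str):
    """Log per-job latency as a structured JSON record, the kind of
    signal a metrics pipeline would scrape or ship downstream."""
    def deco(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                latency_ms = (time.perf_counter() - start) * 1000
                logger.info(json.dumps(
                    {"job": job_name, "latency_ms": round(latency_ms, 2)}
                ))
        return wrapper
    return deco


@track_latency("per_video_inference")
def run_inference(frames: int) -> int:
    # Stand-in for a real model call.
    return frames * 2
```

Emitting metrics as structured logs keeps the instrumentation decoupled from any particular backend; a collector can parse and aggregate the records later.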
Lightweight Frontend Integrations: Support the web app’s Cloud Integrations and Project Workflows through backend endpoints and TypeScript SDK bindings.
You Might Be a Fit If You Have
Deep proficiency in Python, Go, or TypeScript for backend systems.
Experience with AWS, GCP, or Azure (IAM, S3/Blob Storage, Batch/Compute APIs, etc.).
Strong understanding of GPU inference scaling, Kubernetes, container orchestration, and event-driven pipelines.
Prior experience designing REST/gRPC APIs, SDKs, or developer-facing infrastructure.
Familiarity with asynchronous job orchestration (Ray, Airflow, Dagster, Temporal).
A practical mindset: you take research-grade systems and make them reliable, fast, and usable.
Nice to Have
Experience contributing to inference orchestration frameworks or ML infra tools (e.g., DeepSpeed, Triton, Ray Serve).
Understanding of video encoding, chunking, and streaming formats for efficient multi-modal ingestion.
Basic front-end experience (React / Next.js) for integrating backend pipelines into product workflows.
Background in ML infrastructure, observability, or data management systems.