Motional

Senior Backend Engineer, Data Mining

Motional, Boston, Massachusetts, US, 02298

Base pay range $159,000.00/yr - $207,000.00/yr

Mission Summary

At Motional, we’re transforming how autonomous vehicles discover critical intelligence hidden within petabytes of multimodal sensor data. Our next-generation autonomous driving stack depends on finding the rare edge cases, long-tail scenarios, and model errors that matter most. OmniTag, our ML-powered multimodal data mining framework, is the engine that powers this discovery.

What You’ll Do

Architect the OmniTag Engine:

Design and build high-throughput, low-latency backend systems that execute billion-scale inference across Ray/Spark, transforming raw sensor data into unified multimodal representations. Optimize for both query latency and resource efficiency in a cost-sensitive, cloud-based environment.

Scale Multimodal Data Pipelines:

Own the complete data journey—from ingestion, normalization, and preprocessing of heterogeneous modalities (image, video, LiDAR, audio) through encoding, indexing, and cached embedding storage. Ensure pipelines are robust, observable, and meet the SLOs expected by downstream ML teams.

Evolve the Vector Search and Retrieval Engine:

Enhance our in-house billion-scale vector search engine to power RAG-driven few-shot dataset creation. Optimize embedding storage, retrieval performance, and filtering across billions of examples to enable rapid interactive mining workflows.

Own Data Quality and Observability:

Build comprehensive monitoring, logging, and alerting for multimodal data preprocessing pipelines. Develop data validation frameworks that catch regressions in data alignment, normalization, or encoding quality—critical for maintaining model performance.

Collaborate on Encoder-Decoder Adaptation:

Work closely with ML engineers to support domain-specific fine-tuning workflows, model versioning, and A/B testing of new encoders and decoders. Ensure the backend infrastructure enables rapid experimentation with emerging open-source multimodal foundation models.

Drive Production Reliability:

Establish patterns for graceful degradation, fault tolerance, and cost optimization. Operate OmniTag as a mission-critical data platform serving the entire ML organization, with a focus on reliability, debuggability, and operational excellence.

What We’re Looking For (Must-Haves)

BS in Computer Science or a related field, or equivalent professional experience

6+ years designing, building, and operating large-scale distributed systems in production environments

Deep, hands‑on expertise with Ray or Spark (or both) for distributed data processing and large-scale inference workloads

Expert-level Python proficiency with strong software engineering fundamentals: testing (unit, integration, and end-to-end), CI/CD pipelines, containerization, and code review practices

Proven experience optimizing and scaling production data pipelines that process terabytes or petabytes of data

Strong SQL and data manipulation skills; comfort with both structured and semi-structured data

Experience with cloud infrastructure (AWS preferred: S3, EC2, EKS, EMR, IAM) and infrastructure-as-code patterns

Demonstrated track record of shipping robust, well-tested, production-grade systems and mentoring junior engineers

Bonus Points (Nice-to-Haves)

MS/PhD in Computer Science, Machine Learning, or a related field

Experience building or scaling vector databases, large-scale information retrieval systems, or similarity search engines

Hands-on work with multimodal machine learning models, foundation models (LLMs/VLMs), or embeddings-based systems

Familiarity with ML frameworks (PyTorch, JAX) and the ecosystem around multimodal models

Production experience with workflow orchestration (Airflow, Kubeflow, Dagster) and stream processing (Kafka, Flink)

Understanding of model serving patterns, feature stores, or MLOps infrastructure

Domain knowledge in autonomous driving, computer vision, or sensor fusion

Experience with ML-based data mining, active learning, or contrastive learning approaches

Location and Schedule

We encourage a hybrid schedule, with in-office time at one of our locations in Boston, Pittsburgh, or Las Vegas to support collaboration. This role can also be fully remote.

Salary

Salary range: $159,000 USD - $207,000 USD. The estimate reflects base salary only. This role may include additional forms of compensation such as a bonus or company equity.

Company Overview

Motional is a driverless technology company making autonomous vehicles a safe, reliable, and accessible reality. We’re driven by something more. We aren’t just developing driverless cars; we’re creating safer roadways, more equitable transportation options, and making our communities better places to live, work, and connect. Our team is made up of engineers, researchers, innovators, dreamers, and doers who are creating a technology with the potential to transform the way we move.

We’re creating first-of-its-kind technology that will transform transportation. To do so successfully, we must design for everyone in our cities and on our roads. We believe in building a great place to work through a progressive, global culture that is diverse, inclusive, and ensures people feel valued at every level of the organization.

EEO Statement

Motional AD Inc. is an EOE. We celebrate diversity and are committed to creating an inclusive environment for all employees. To comply with federal law, we participate in E-Verify. All newly hired employees are queried through this electronic system established by the DHS and the SSA to verify their identity and employment eligibility.
