NDimensions Labs
Pay Range
This range is provided by NDimensions Labs. Your actual pay will be based on your skills and experience; talk with your recruiter to learn more.
Base pay range: $150,000.00/yr - $300,000.00/yr
About Us
We're a team of technologists from MIT, the University of Waterloo, and the University of Washington who have turned deep tech into products across multiple successful ventures.
About the Role
We're looking for a Vision/SLAM-focused Robotics Software Engineer to push the boundaries of robot perception and mapping, combining state-of-the-art visual learning with proven SLAM and sensor fusion.
You will architect and ship the perception–mapping stack that powers intelligent behavior: visual/VIO pipelines with robust tracking, loop closure, and relocalization; semantic, multi-layer maps that planners and policies can act on; and world models that bridge simulation and real deployment. Expect to fuse RGB/RGB-D/LiDAR/IMU under strict latency/compute budgets, adapt modern ViTs/VLMs for 2D–3D understanding, and deliver mapping-aware control that closes the loop from pixels to policies.
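To give a concrete flavor of the back-end side of this stack, here is a minimal 2D pose-graph sketch in Python using GTSAM (one of the optimization libraries named below). The square trajectory, noise sigmas, and loop-closure measurement are purely illustrative, not taken from our codebase:

import numpy as np
import gtsam

graph = gtsam.NonlinearFactorGraph()

# Noise models: a tight prior anchors the gauge; odometry and loop factors share one model.
prior_noise = gtsam.noiseModel.Diagonal.Sigmas(np.array([1e-3, 1e-3, 1e-3]))
odom_noise = gtsam.noiseModel.Diagonal.Sigmas(np.array([0.1, 0.1, 0.05]))

graph.add(gtsam.PriorFactorPose2(1, gtsam.Pose2(0.0, 0.0, 0.0), prior_noise))

# Odometry: four 2 m legs, turning 90 degrees after each, tracing a square.
step = gtsam.Pose2(2.0, 0.0, np.pi / 2)
for i in range(1, 5):
    graph.add(gtsam.BetweenFactorPose2(i, i + 1, step, odom_noise))

# Loop closure: the robot re-recognizes pose 2's place from pose 5.
graph.add(gtsam.BetweenFactorPose2(5, 2, step, odom_noise))

# Initial guess accumulated from deliberately drifty odometry.
initial = gtsam.Values()
guess = gtsam.Pose2(0.0, 0.0, 0.0)
for i in range(1, 6):
    initial.insert(i, guess)
    guess = guess.compose(gtsam.Pose2(2.1, 0.1, np.pi / 2 + 0.05))

result = gtsam.LevenbergMarquardtOptimizer(graph, initial).optimize()
print(result.atPose2(5))  # pulled back toward the true square by the loop closure

The single loop-closure factor between poses 5 and 2 is what lets the optimizer correct the drift accumulated along the odometry chain; production pipelines add robust kernels and place-recognition-driven candidate verification on top of this core.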
What You’ll Do
Own the mapping stack: design and ship visual SLAM pipelines (front-end + back-end) with robust tracking, loop closure, and relocalization under tight latency/compute budgets.
Build semantic maps: fuse geometry with semantics into multi-layer maps usable by planners and policies.
World models: develop learned predictive/latent-state models that capture scene dynamics and uncertainty; integrate them with control and task policies.
Multi-sensor fusion: calibrate and fuse RGB/RGB-D/LiDAR/IMU/wheel odometry; handle time sync, extrinsics, and degraded sensing (see the time-sync sketch after this list).
Representation learning: adapt ViTs/VLMs for segmentation, detection, tracking, place recognition, and 3D understanding; learn scene graphs and object-centric representations.
Advance the stack: explore beyond current VLAs (OpenVLA/RT-2/RT-X), adapt ViTs (DINO, SAM), VLMs (CLIP, BLIP-2, LLaVA), and diffusion planners (UniPi, Diffusion Policy) for mapping-aware control.
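On the fusion bullet above, here is a minimal sketch of the time-sync step: interpolating gyro samples onto clock-offset-corrected camera timestamps. The function, its signature, and the linear-interpolation choice are illustrative assumptions, not a prescription of our stack:

import numpy as np

def interpolate_imu_to_camera(imu_t, imu_gyro, cam_t, time_offset=0.0):
    """Linearly interpolate gyro samples onto offset-corrected camera timestamps.

    imu_t:       (N,) monotonically increasing IMU timestamps, seconds
    imu_gyro:    (N, 3) angular-velocity samples
    cam_t:       (M,) camera frame timestamps, seconds
    time_offset: camera-to-IMU clock offset (e.g. estimated during calibration)
    """
    t = cam_t + time_offset                  # shift camera stamps into the IMU clock
    ok = (t >= imu_t[0]) & (t <= imu_t[-1])  # drop frames we would have to extrapolate
    aligned = np.stack(
        [np.interp(t[ok], imu_t, imu_gyro[:, axis]) for axis in range(3)], axis=-1
    )
    return t[ok], aligned

Real pipelines go further (hardware triggering, continuous-time interpolation, online offset estimation), but getting this alignment wrong is a classic source of VIO drift.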
What We’re Looking For
SLAM expertise: visual/VIO/VSLAM experience (feature- or direct-based), bundle adjustment, factor graphs, pose-graph optimization, loop closure, place recognition, robust estimation.
Semantic mapping: panoptic/instance segmentation, 2D-to-3D lifting (see the lifting sketch after this list), multi-layer map fusion, uncertainty modeling, lifelong/incremental mapping.
World modeling: learned state-space models, dynamics prediction.
Strong CV & multimodal background: transformer-based models, self-supervised learning, tracking, foundation model adaptation for robotics.
Engineering: C++ and Python; CUDA/TensorRT a plus; ROS 2; strong profiling/latency discipline; productionizing perception systems on robots.
Data: curation/augmentation for robotics; evaluation protocols.
Sim + real: Isaac/MuJoCo/Habitat and on-robot bring-up; optimization libraries (Ceres, GTSAM) and geometry libraries (OpenCV, Open3D).
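For the 2D-to-3D lifting mentioned above, here is a minimal NumPy sketch that back-projects a depth image through a pinhole model into a semantically labeled point cloud; the function name and intrinsics values are hypothetical:

import numpy as np

def lift_depth_to_points(depth, labels, fx, fy, cx, cy):
    """Back-project a depth image into a labeled 3D point cloud.

    depth:  (H, W) metric depth, 0 where invalid
    labels: (H, W) per-pixel class ids (e.g. from a panoptic model)
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    valid = depth > 0
    z = depth[valid]
    x = (u[valid] - cx) * z / fx                    # pinhole back-projection
    y = (v[valid] - cy) * z / fy
    return np.stack([x, y, z], axis=-1), labels[valid]

# Hypothetical usage with common RGB-D intrinsics:
# points, point_labels = lift_depth_to_points(depth_m, seg, 525.0, 525.0, 319.5, 239.5)

Fusing such per-frame labeled points across poses, with per-point uncertainty, is the starting point for the multi-layer map fusion this role owns.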
Bonus
Differentiable SLAM or neural fields (NeRF/3DGS) integrated with classical stacks.
Active perception, task‑driven exploration, or belief‑space planning.
Publications at top venues (CVPR/ICCV/ECCV/CoRL/RSS/ICRA/IROS).
Experience with large‑scale multi‑robot mapping or map compression/streaming.
We are an equal opportunity employer and welcome applicants from all backgrounds.
If you'd like to share more about your work, such as papers, repos, or demos, feel free to send your CV and links to careers@ndimensions.xyz.