Harrison Clarke

Senior AI Engineer (Sunnyvale)

Harrison Clarke, Sunnyvale, California, United States, 94087

Senior AI Engineer - Video Search (Applied Research & Product)
Remote - United States

About the Company

We're partnering with a U.S.-based applied AI company building next-generation real-time video understanding systems deployed at scale across enterprise, government, and public environments. The platform combines cutting-edge multimodal AI, vector search, and high-performance inference pipelines to make visual data searchable, interpretable, and actionable in real time.

This is a chance to join a well-funded, mission-driven organization with tens of thousands of active camera streams and a rapidly growing R&D team pushing the boundaries of multimodal retrieval and AI systems design.

The Role

We're looking for a Senior AI Engineer to lead the applied research and productionization of our video search and retrieval stack, connecting natural-language queries to high-dimensional video representations with real-time performance.

You'll work at the intersection of deep learning research, scalable systems, and GPU-optimized inference, owning models and pipelines end-to-end from training through deployment.

What You'll Do

- Design and build natural-language-to-video retrieval systems using state-of-the-art architectures (e.g., V-JEPA, CLIP, SigLIP, Video-LLMs, ViViT, TimeSformer).
- Develop temporal localization and video summarization capabilities with fine-grained moment-level embeddings.
- Stand up vector search infrastructure (FAISS, Milvus, pgvector, Pinecone) with optimized sharding, caching, and hybrid retrieval strategies.
- Optimize GPU inference and serving pipelines using ONNX Runtime, TensorRT, or ROCm for low-latency performance.
- Drive multi-GPU training and distributed serving (FSDP, ZeRO, DDP, NCCL/RCCL) with a strong understanding of parallelization and quantization techniques.
- Collaborate with MLOps, backend, and product teams to deliver production-ready AI features at scale.
- Define and track key retrieval and relevance metrics (R@K, mAP, nDCG) and run live A/B evaluations.
- Mentor junior engineers, document design decisions, and drive innovation through rigorous experimentation.

What We're Looking For

- 6-10+ years of experience in machine learning or applied AI, with 4+ years focused on video understanding, multimodal retrieval, or transformer-based models.
- Proficiency in PyTorch and deep learning frameworks; experience with video backbones, contrastive training, and representation learning.
- Strong understanding of vector databases, ANN search (HNSW, IVF), and embedding pipelines.
- Demonstrated ability to ship high-performance AI systems with GPU optimization, ONNX/TensorRT, or ROCm pipelines.
- Experience with distributed training, CI/CD for ML, and scalable data pipelines (MLflow, W&B, K8s, Docker).
- Excellent communication skills and a collaborative, low-ego approach to problem solving.

Nice-to-Haves

- Experience with temporal detection, video tracking, or re-ID.
- Exposure to Video-RAG or structured retrieval (metadata + knowledge graph).
- Background in real-time or edge inference systems.
- Interest in privacy-preserving or regulated AI systems.

Compensation & Logistics

Compensation: Competitive base salary + bonus + equity

Location: Fully remote (U.S.-based)

Why Join

- Build real-world AI that operates at scale and latency levels few companies ever reach.
- Collaborate with world-class engineers and researchers in a fast-paced, mission-oriented environment.
- Work on deep technical challenges (multimodal search, retrieval, inference optimization) with real-world outcomes.