Simarn Solutions

Senior AI/ML Data Engineer

Simarn Solutions, San Francisco, California, United States, 94199

Job Title: Senior AI/ML Data Engineer

Location: California (Onsite)

Job Type: C2C

Position Overview

We are seeking a highly skilled Senior AI/ML Data Engineer with expertise in multimodal AI models, vector databases, and Azure-based data engineering solutions. The candidate will lead the design and implementation of scalable AI-driven data platforms, integrating advanced ML pipelines with modern cloud architectures.

This is a client-facing, onsite role in California, requiring both technical depth and leadership ability to deliver enterprise-grade AI/data solutions.

Key Responsibilities

Design, build, and optimize AI/ML pipelines using multimodal models (CLIP, BLIP, Whisper, or similar).

Implement vector database solutions (FAISS, Milvus, Weaviate) and embedding pipelines for retrieval-augmented systems.

Develop and maintain data ingestion, ETL, and streaming solutions using Microsoft Fabric, Azure Data Factory, Azure Event Hubs, and Kafka.

Architect, implement, and optimize Azure-based solutions, including Azure Synapse, Databricks, and Azure SQL.

Write efficient SQL, Python, and Spark code for data transformation, ML feature engineering, and analytics.

Use infrastructure-as-code (Terraform, ARM templates) for automated deployment and environment consistency.

Collaborate with cross-functional teams (data science, cloud, and business stakeholders) to translate requirements into scalable solutions.

Ensure data governance, compliance, and security best practices are applied across all platforms.

Required Skills & Qualifications

Hands-on experience with AI/ML multimodal models (CLIP, BLIP, Whisper, or similar).

Strong proficiency in Python for AI/ML and automation workflows.

Experience with vector databases (FAISS, Milvus, Weaviate) and embedding pipelines.

Proficiency in Microsoft Azure services: Azure Data Factory, Azure Synapse, Azure Databricks, Azure SQL, and Azure Event Hubs.

Experience with data streaming platforms (Azure Event Hubs, Kafka).

Strong skills in SQL and Spark for large-scale data engineering.

Experience with infrastructure-as-code (Terraform, ARM templates).

Strong problem-solving, debugging, and performance optimization skills.

Excellent communication skills, with the ability to mentor junior engineers and engage with stakeholders.

Nice to Have

Experience with Oracle PL/SQL and hybrid database ecosystems.

Knowledge of data warehousing, data modeling, and governance frameworks.

Exposure to MLOps frameworks and enterprise-grade AI deployment.