Fabrion
ML/AI Research Engineer — Agentic AI Lab (Founding Team)
Fabrion, San Francisco, California, United States, 94199
Overview

Location: San Francisco Bay Area
Type: Full-Time
Compensation: Competitive salary + meaningful equity (founding tier)

Backed by 8VC, we're building a world-class team to tackle one of the industry's most critical infrastructure problems.

About The Role

We're designing the future of enterprise AI infrastructure, grounded in agents, retrieval-augmented generation (RAG), knowledge graphs, and multi-tenant governance.

We're looking for an ML/AI Research Engineer to join our AI Lab and lead the design, training, evaluation, and optimization of agent-native AI models. You'll work at the intersection of LLMs, vector search, graph reasoning, and reinforcement learning, building the intelligence layer that sits on top of our enterprise data fabric.

This isn't a prompt-engineering role. It's full-cycle ML: from data curation and fine-tuning to evaluation, interpretability, and deployment, with cost-awareness, alignment, and agent coordination all in scope.

Core Responsibilities

- Fine-tune and evaluate open-source LLMs (e.g. LLaMA 3, Mistral, Falcon, Mixtral) for enterprise use cases with both structured and unstructured data
- Build and optimize RAG pipelines using LangChain, LangGraph, LlamaIndex, or Dust, integrated with our vector DBs and internal knowledge graph
- Train agent architectures (ReAct, AutoGPT, BabyAGI, OpenAgents) using enterprise task data
- Develop embedding-based memory and retrieval chains with token-efficient chunking strategies
- Create reinforcement learning pipelines to optimize agent behaviors (e.g. RLHF, DPO, PPO)
- Establish scalable evaluation harnesses for LLM and agent performance, including synthetic evals, trace capture, and explainability tools
- Contribute to model observability, drift detection, error classification, and alignment
- Optimize inference latency and GPU resource utilization across cloud and on-prem environments

Desired Experience

Model Training
- Deep experience fine-tuning open-source LLMs using HuggingFace Transformers, DeepSpeed, vLLM, FSDP, and LoRA/QLoRA
- Have worked with both base and instruction-tuned models; familiar with SFT, RLHF, and DPO pipelines
- Comfortable building and maintaining custom training datasets, filters, and eval splits
- Understand tradeoffs in batch size, token window, optimizer, precision (FP16, bfloat16), and quantization

RAG + Knowledge Graphs
- Experience building enterprise-grade RAG pipelines integrated with real-time or contextual data
- Familiar with LangChain, LangGraph, LlamaIndex, and open-source vector DBs (Weaviate, Qdrant, FAISS)
- Experience grounding models with structured data (SQL, graph, metadata) plus unstructured sources
- Bonus: experience with Neo4j, PuppyGraph, RDF, OWL, or other semantic modeling systems

Agent Intelligence
- Experience training or customizing agent frameworks with multi-step reasoning and memory
- Understand common agent loop patterns (e.g. Plan → Act → Reflect), memory recall, and tool use
- Familiar with self-correction, multi-agent communication, and agent ops logging

Optimization
- Strong background in token cost optimization, chunking strategies, reranking (e.g. Cohere, Jina), compression, and retrieval latency tuning
- Experience running models in quantized (int4/int8) or multi-GPU settings with inference tuning (vLLM, TGI)

Preferred Tech Stack

LLM Training & Inference: HuggingFace Transformers, DeepSpeed, vLLM, FlashAttention, FSDP, LoRA
Agent Orchestration: LangChain, LangGraph, ReAct, OpenAgents, LlamaIndex
Vector DBs: Weaviate, Qdrant, FAISS, Pinecone, Chroma
Graph Knowledge Systems: Neo4j, PuppyGraph, RDF, Gremlin, JSON-LD
Storage & Access: Iceberg, DuckDB, Postgres, Parquet, Delta Lake
Evaluation: OpenLLM Evals, TruLens, Ragas, LangSmith, Weights & Biases
Compute: Ray, Kubernetes, TGI, SageMaker, Lambda Labs, Modal
Languages: Python (core); optionally Rust (for inference layers) or JS (for UX experimentation)

Soft Skills & Mindset

- Startup DNA: resourceful, fast-moving, and capable of working through ambiguity
- Deep curiosity about agent-based architectures and real-world enterprise complexity
- Comfortable owning model performance end-to-end, from dataset to deployment
- Strong instincts around explainability, safety, and continuous improvement
- Enjoy pair-designing with product and UX to shape capabilities, not just APIs

Why This Role Matters

This role is foundational to our thesis: that agents, enterprise data, and knowledge modeling together can create intelligent infrastructure for real-world, multi-billion-dollar workflows. Your work won't be buried in research reports; it will be productionized and used by hundreds of users across hundreds of thousands of decisions. If this is your dream role, we would love to hear from you.