Eloquent AI

AI Engineer, RAG

Eloquent AI, San Francisco

Meet Eloquent AI

At Eloquent AI, we’re building the next generation of AI Operators—multimodal, autonomous systems that execute complex workflows across fragmented tools with human-level precision. Our technology goes far beyond chat: it sees, reads, clicks, types, and makes decisions—transforming how work gets done in regulated, high-stakes environments.

We’re already powering some of the world’s leading financial institutions and insurers, fundamentally changing how millions of people manage their finances every day. From automating compliance reviews to handling customer operations, our Operators are quietly replacing repetitive, manual tasks with intelligent, end-to-end execution.

Headquartered in San Francisco with a global footprint, Eloquent AI is a fast-growing company backed by top-tier investors. Join us to work alongside world-class talent in AI, engineering, and product as we redefine the future of financial services.

Your Role

As a Senior AI Engineer, RAG at Eloquent AI, you will play a critical role in designing, building, and optimizing the Retrieval-Augmented Generation (RAG) systems that power our enterprise AI agents. You will work on scalable, high-performance AI infrastructure, ensuring our LLM-powered agents deliver accurate, real-time responses with deep knowledge retrieval.

This role requires a strong software engineering background; expertise in LLMs, RAG architectures, and agentic frameworks; and the ability to translate cutting-edge research into production-ready AI systems. You will collaborate with researchers, engineers, and product teams to advance our AI capabilities and ensure that our agents retrieve and generate knowledge with precision and efficiency.

You will:

  • Design and implement scalable RAG pipelines that enable AI agents to retrieve and generate knowledge in real time.

  • Develop and optimize knowledge retrieval systems, fine-tuning embeddings, vector search, and ranking models.

  • Work with LLM architectures, applying prompt engineering, fine-tuning, and reinforcement learning techniques to improve response accuracy.

  • Optimize large-scale AI workloads, ensuring low latency and high efficiency for enterprise-grade AI applications.

  • Collaborate with AI researchers to translate state-of-the-art RAG advancements into deployable, high-performing solutions.

  • Leverage cloud infrastructure (AWS, GCP, or Azure) to build distributed, high-availability AI systems.

  • Continuously improve knowledge ingestion, ensuring AI agents stay up to date with evolving enterprise datasets.

Requirements

  • 5+ years of software engineering experience, with a focus on AI, NLP, or distributed systems.

  • Strong proficiency in Python and experience with AI frameworks like PyTorch and TensorFlow.

  • Expertise in RAG architectures, including experience with vector databases (e.g., FAISS, Weaviate, Pinecone, Milvus) and document retrieval methods.

  • Familiarity with LLM training, knowledge distillation, and agentic frameworks.

  • Experience with cloud computing and building scalable, production-ready AI applications.

  • Ability to optimize AI models for efficiency, balancing accuracy, latency, and cost.

  • Deep understanding of NLP and IR techniques, including tokenization, embeddings, ranking algorithms, and their evaluation.

Bonus Points If…

  • You have published research in AI, NLP, or RAG-related topics at top-tier conferences (NeurIPS, ICML, ICLR, ACL, SIGIR, etc.).

  • You have experience implementing hybrid RAG pipelines, combining retrieval with multi-step reasoning and tool use.

  • You’ve worked in high-performance AI teams, scaling AI-driven applications in fast-growth environments.

  • You have experience with Reinforcement Learning from Human Feedback (RLHF) and optimizing LLMs for enterprise use cases.

  • You are comfortable working in cross-functional AI product teams, collaborating with researchers, engineers, and product managers.
