ExlService Holdings, Inc.
Location: Exl - Utah - UT (Work From Home)
Job Role: Application Development - Applications Development Engineering
Experience (in years): 9-12
Job Description

Key Responsibilities
1. RAG Development & Optimization
- Design and implement Retrieval-Augmented Generation (RAG) pipelines to ground LLMs in enterprise or domain-specific data (a minimal retrieval sketch follows this list).
- Make strategic decisions on chunking strategy, embedding models, and retrieval mechanisms to balance context precision, recall, and latency.
- Work with vector databases (Qdrant, Weaviate, pgvector, Pinecone) and embedding frameworks (OpenAI, Hugging Face, Instructor, etc.).
- Diagnose and iterate on challenges such as chunk-size trade-offs, retrieval quality, context window limits, and grounding accuracy, using structured evaluation and metrics.
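By way of illustration, here is a minimal sketch of that retrieval step: fixed-size chunking with overlap, embedding, and an in-memory cosine-similarity search. The sentence-transformers model name is an illustrative placeholder, and a production pipeline would swap the in-memory index for a vector database such as Qdrant or pgvector.

```python
# Minimal RAG retrieval sketch: chunk -> embed -> cosine-similarity search.
# Illustrative only; assumes `sentence-transformers` and `numpy` are installed.
import numpy as np
from sentence_transformers import SentenceTransformer

def chunk(text: str, size: int = 500, overlap: int = 100) -> list[str]:
    """Fixed-size character chunking with overlap; chunk size vs. recall
    is exactly the kind of trade-off this role is expected to tune."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder embedding model

def build_index(docs: list[str]) -> tuple[list[str], np.ndarray]:
    chunks = [c for d in docs for c in chunk(d)]
    vecs = model.encode(chunks, normalize_embeddings=True)  # unit vectors
    return chunks, np.asarray(vecs)

def retrieve(query: str, chunks: list[str], vecs: np.ndarray, k: int = 3) -> list[str]:
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = vecs @ q  # cosine similarity, since vectors are normalized
    return [chunks[i] for i in np.argsort(scores)[::-1][:k]]
```

The retrieved chunks would then be concatenated into the prompt that grounds the LLM's answer.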
2. Chatbot Quality & Evaluation Frameworks
- Establish comprehensive evaluation frameworks for LLM applications, combining quantitative methods (BLEU, ROUGE, response time) with qualitative methods (human evaluation, LLM-as-a-judge, relevance, coherence, user satisfaction).
- Implement continuous monitoring and automated regression testing using tools such as LangSmith, LangFuse, Arize, or custom evaluation harnesses (one such harness is sketched after this list).
- Identify and prevent quality degradation, hallucinations, and factual inconsistencies before production release.
- Collaborate with design and product to define success metrics and user feedback loops for ongoing improvement.
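As an example of the "custom evaluation harness" end of that spectrum, here is a hedged sketch of a regression gate with a pluggable judge. The keyword judge is a trivial stand-in for an LLM-as-a-judge call, and the field names and threshold are assumptions for illustration.

```python
# Minimal regression-testing harness sketch for an LLM app.
# The judge below is a placeholder; in practice it would be an
# LLM-as-a-judge call or a reference metric such as ROUGE.
from dataclasses import dataclass
from typing import Callable

@dataclass
class EvalCase:
    question: str
    expected_facts: list[str]  # facts the answer must mention

def keyword_judge(answer: str, case: EvalCase) -> float:
    """Score = fraction of expected facts present (stand-in judge)."""
    hits = sum(f.lower() in answer.lower() for f in case.expected_facts)
    return hits / len(case.expected_facts)

def run_regression(app: Callable[[str], str],
                   cases: list[EvalCase],
                   judge: Callable[[str, EvalCase], float] = keyword_judge,
                   threshold: float = 0.8) -> bool:
    """Gate a release on the mean score across the regression set."""
    scores = [judge(app(c.question), c) for c in cases]
    mean = sum(scores) / len(scores)
    print(f"mean score: {mean:.2f} over {len(cases)} cases")
    return mean >= threshold
```

Wired into CI, a harness of this shape is what catches quality degradation or factual drift before a release ships.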
3. Guardrails, Safety & Responsible AI
- Implement multi-layered guardrails across input validation, output filtering, prompt engineering, re-ranking, and abstention (“I don’t know”) strategies (a layered sketch follows this list).
- Use frameworks such as Guardrails AI, NeMo Guardrails, or Llama Guard to ensure compliance, safety, and brand integrity.
- Build policy-driven safety systems for handling sensitive data, user content, and edge cases, with clear escalation paths.
- Balance safety, user experience, and helpfulness, knowing when to block, rephrase, or gracefully decline a response.
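A minimal sketch of what "multi-layered" can mean in code: an input-validation layer, a grounding-confidence abstention layer, and an output filter. The regex, threshold, and block list are illustrative assumptions rather than a production policy; frameworks like Guardrails AI or NeMo Guardrails would replace the hand-rolled checks.

```python
# Layered guardrails sketch: input validation -> generation -> output filter,
# with abstention when retrieval confidence is low. Policies are illustrative.
import re

SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")        # toy PII pattern (US SSN)
BLOCKED_TERMS = {"internal-only", "confidential"}    # placeholder block list

def validate_input(user_msg: str) -> str | None:
    """Layer 1: reject or redact unsafe input before it reaches the model."""
    if SSN_RE.search(user_msg):
        return "I can't process messages containing personal identifiers."
    return None

def guarded_answer(user_msg: str, generate, retrieval_score: float) -> str:
    if (refusal := validate_input(user_msg)) is not None:
        return refusal
    # Layer 2: abstain rather than hallucinate on weak grounding.
    if retrieval_score < 0.3:  # threshold is an assumption to tune
        return "I don't know based on the documents I have access to."
    answer = generate(user_msg)
    # Layer 3: filter the model's output before it reaches the user.
    if any(t in answer.lower() for t in BLOCKED_TERMS):
        return "I can't share that. Escalating to a human reviewer."
    return answer
```

The ordering matters: cheap deterministic checks run first, so the expensive model call is only made for traffic that passes them.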
4. Multi-Agent Systems & Orchestration
- Design and operate multi-agent workflows using orchestration frameworks such as LangGraph, AutoGen, CrewAI, or Haystack.
- Coordinate routing logic, task delegation, and parallel vs. sequential agent execution to handle complex reasoning and multi-step tasks (see the routing sketch after this list).
- Build observability and debugging tools for tracking agent interactions, performance, and cost.
- Evaluate trade-offs around latency, reliability, and scalability in production-grade multi-agent environments.
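Here is a framework-agnostic sketch of that routing and parallel-vs-sequential decision. The agent names and routing rule are illustrative assumptions; a real deployment would express the same shape as a LangGraph or AutoGen workflow.

```python
# Framework-agnostic multi-agent routing sketch: a router picks agents,
# independent agents run in parallel, dependent steps run sequentially.
import asyncio

async def search_agent(task: str) -> str:
    return f"[search] results for: {task}"    # stand-in agent

async def summarize_agent(task: str) -> str:
    return f"[summary] of: {task}"            # stand-in agent

def route(task: str) -> list:
    """Toy routing rule; production routers are often LLM- or policy-driven."""
    return [search_agent, summarize_agent] if "report" in task else [search_agent]

async def orchestrate(task: str) -> str:
    agents = route(task)
    # Fan out independent agents in parallel to cut latency,
    partials = await asyncio.gather(*(a(task) for a in agents))
    # then run the dependent aggregation step sequentially.
    return await summarize_agent(" | ".join(partials))

print(asyncio.run(orchestrate("quarterly report on churn")))
```

The router roughly corresponds to a conditional edge in a graph-based orchestrator, which is where the latency/reliability trade-offs named above get decided.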
Qualifications

Minimum Qualifications
- 10+ years of experience in Data Science, Data Engineering, or Machine Learning.
- Bachelor’s Degree in Computer Science, Information Systems, or a related field.
- Proficiency in Python (FastAPI, Flask, asyncio); GCP experience is good to have.
- Demonstrated hands-on RAG implementation experience with specific tools, models, and evaluation metrics.
- Practical knowledge of agentic frameworks (LangGraph, LangChain) and evaluation ecosystems (LangFuse, LangSmith).
- Excellent communication skills, a proven ability to collaborate cross-functionally, and a low-ego, ownership-driven work style.
Preferred / Good-to-Have Qualifications
- Experience in traditional AI/ML workflows, e.g., model training, feature engineering, and deployment of ML models (scikit-learn, TensorFlow, PyTorch).
- Familiarity with retrieval optimization, prompt tuning, and tool-use evaluation.
- Background in observability and performance profiling for large-scale AI systems.
- Understanding of security and privacy principles for AI systems (PII redaction, authentication/authorization, RBAC).
- Exposure to enterprise chatbot systems, LLMOps pipelines, and continuous model evaluation in production.