Vidorra Consulting Group

AI Research Engineer / Data Scientist (LLM)

Vidorra Consulting Group, Convent Station, New Jersey, US, 07961

Position: AI Research Engineer / Data Scientist (LLM), Senior

Job location: Morristown, NJ (onsite; tri-state candidates)

Full-time

Role Summary

Own end-to-end delivery of LLM systems and agentic workflows. You’ll drive architecture and evaluation strategy, productionize services with reliability and guardrails, and mentor junior engineers while partnering with product teams and stakeholders.

What You’ll Do

Lead LLM/agent solutions from POC to pilot to production (tool/function calling, planning, fallbacks).

Architect retrieval stacks (chunking strategies, hybrid search, metadata, re-ranking) and hallucination controls.

Design offline and online evaluations, golden sets, CI eval gates, and experiment frameworks for correctness, faithfulness, safety, and bias.

Implement confidence scoring and calibration (retrieval/LLM agreement, self-consistency, logprobs/entropy) with abstain/deferral and user-visible citations.

Optimize cost/latency/reliability (caching, batching, routing, distillation/quantization where useful).

Stand up secure APIs/services with observability, tracing, RBAC, and audit logs; uphold privacy and compliance requirements.

Advise on prompting vs. fine-tuning/LoRA; own model and vendor selection and trade-offs.

Mentor teammates; collaborate on roadmaps and stakeholder-facing KPIs.

Must-Have (Core)

4–8 years in applied ML/data/engineering, with shipped LLM applications (RAG, agent/tool calling, structured extraction, domain QA).

Demonstrated ownership of eval strategy (golden sets, pass/fail thresholds, A/B) and integration of eval gates into CI/CD.

Hands-on experience with confidence and calibration techniques and with production abstention/deferral policies.

Deep retrieval experience (hybrid search, filters, re‑ranking, chunking) and mitigation of hallucinations.

Strong Python engineering (FastAPI), SQL/data wrangling, containers, and observability (logs/spans/metrics).

Cloud experience (Azure/AWS/Google Cloud Platform) and vector/search tech (e.g., Azure AI Search, Elasticsearch, Pinecone/FAISS).

Track record of taking ambiguous business problems to measurable outcomes and shipping on timelines.

Nice to Have

Fine‑tuning/LoRA, prompt routing, multi‑agent orchestration (planner/critic/tool catalogs).

Advanced uncertainty: calibration curves, conformal methods, or LLM-as‑a‑judge with safeguards.

Distillation/quantization and model serving efficiency; GPU/CPU trade‑offs.

Document AI (layout‑aware models, VLMs), Snowflake/Databricks, Airflow/Prefect.

Safety/compliance leadership (red‑teaming playbooks, model cards, DPIA/PIA inputs).

Domain expertise (e.g., insurance/finance/healthcare) and regulated data handling.
