Oversight

ML Engineer, Agentic AI

Oversight, Atlanta, Georgia, United States, 30383

Oversight is the world’s leading provider of AI-based spend management and risk mitigation solutions for large enterprises. Based in Atlanta, GA, Oversight works with many of the world’s most innovative companies and government agencies to digitally transform their spend audit and financial control processes.

Oversight’s AI-powered platform works across our customers’ financial systems to continuously monitor and analyze all spend transactions for fraud, waste, and misuse. With a consolidated, consistent view of risk across their enterprise, customers can prevent financial loss and optimize spend while strengthening the controls that improve compliance.

Position Overview

We are seeking a skilled and forward‑looking ML Engineer with experience in Large Language Models (LLMs), generative AI, and agentic architectures to join our growing R&D and Applied AI team. This role is critical in helping Oversight deliver the next generation of agentic AI systems for enterprise spend management and risk controls.

The ideal candidate has a strong foundation in machine learning, modern deep learning frameworks, and data pipelines, coupled with hands‑on experience experimenting with LLMs, small language models (SLMs), multi‑agent frameworks, and retrieval‑augmented generation (RAG).

You will work closely with AI/ML researchers, data engineers, and product teams to design, implement, and optimize models that power autonomous exception resolution, anomaly detection, and explainable insights. This is a hands‑on engineering role where you will not only build and scale ML systems but also actively contribute to cutting‑edge applied research in agentic AI.

Core ML/LLM Engineering

Contribute to the design, training, fine‑tuning, and deployment of ML/LLM models for production.

Work with frameworks and protocols such as LangChain, LangGraph, and MCP to prototype and optimize multi‑agent workflows.

Develop prompt engineering, optimization, and safety techniques for agentic LLM interactions.

Integrate memory, evidence packs, and explainability modules into agentic pipelines.

Work hands‑on with multiple LLM ecosystems:

OpenAI GPT models (GPT‑4, GPT‑4o, fine‑tuned GPTs).

Anthropic Claude (Claude 2/3 for reasoning and safety‑aligned workflows).

Google Gemini (multimodal reasoning, advanced RAG integration).

Meta LLaMA (fine‑tuned/custom models for domain‑specific tasks).

Data & Infrastructure

Collaborate with Data Engineering to build and maintain real‑time and batch data pipelines that serve ML/LLM workloads.

Conduct feature engineering, preprocessing, and embedding generation for structured and unstructured data.

Implement model monitoring, drift detection, and retraining pipelines.

Leverage cloud ML platforms (AWS SageMaker, Databricks ML) for experimentation and scaling.

Explore and evaluate emerging LLM/SLM architectures and agent orchestration patterns.

Experiment with generative AI and multimodal models to extend capabilities beyond text (images, structured financial data).

Collaborate with R&D to prototype autonomous resolution agents, anomaly detection models, and reasoning engines.

Translate research prototypes into production‑ready components.

Work cross‑functionally with R&D, Data Science, Product, and Engineering to deliver business‑aligned AI features.

Participate in design reviews, architecture discussions, and model evaluations.

Document processes, experiments, and results effectively for knowledge sharing.

Mentor junior engineers and contribute to ML engineering best practices.

Education, Experience and Skills Required

Bachelor’s or Master’s degree in Computer Science, Data Science, Machine Learning, or a related field.

3+ years of experience building and deploying ML systems.

Proficiency in Python and libraries such as PyTorch, TensorFlow, scikit‑learn, and Hugging Face Transformers.

Hands‑on experience with LLMs/SLMs (fine‑tuning, prompt design, inference optimization).

Demonstrated experience with at least two of the following ecosystems:

OpenAI GPT models (chat, assistants, fine‑tuning).

Anthropic Claude (safety‑first AI for reasoning and summarization).

Meta LLaMA (open‑source, fine‑tuned models).

Familiarity with vector databases, embeddings, and RAG pipelines.

Ability to work with structured and unstructured data at scale.

Knowledge of SQL and distributed data frameworks (Spark, Ray).

Strong understanding of the ML lifecycle: data prep, training, evaluation, deployment, monitoring.

Preferred Qualifications

Experience with agentic frameworks and protocols (LangChain, LangGraph, MCP, AutoGen).

Knowledge of AI safety, guardrails, and explainability techniques.

Hands‑on experience deploying ML/LLM solutions in cloud environments (AWS, GCP, Azure).

Experience with CI/CD for ML (MLOps), monitoring, and observability.

Familiarity with anomaly detection, fraud/risk modeling, or behavioral analytics.

Contributions to open‑source AI/ML projects or publications in applied ML research.

Seniority level

Mid‑Senior level

Employment type

Full‑time

Job function

Engineering and Information Technology

Industries

Software Development

Benefits

Medical insurance

Vision insurance

401(k)

Paid maternity leave

Paid paternity leave

Tuition assistance

Disability insurance
