Eightelevengroup

Data Scientist

Eightelevengroup, Houston, Texas, United States, 77246

Data Scientist

Brooksource

Fortune 500 Oil & Gas Client

Houston

Overview

Our client's Digital Solutions team is expanding its advanced analytics and AI capabilities to support major enterprise initiatives, including the BAP AI Program, Sales Enablement AI, and emerging agentic workflow automation projects. With current headcount limited to two scientists, the team is building a strong bench of high-performance Data Scientists skilled in GenAI, enterprise ML engineering, and production-scale model deployment.

This role is ideal for a senior-level Data Scientist with hands-on experience developing, fine-tuning, and deploying models—especially LLM, RAG, and agentic systems—within complex industrial or enterprise environments.

What You’ll Do

Design & deliver GenAI solutions: Architect and implement LLM/LVM applications (text and vision), including prompt strategies, guardrails, evaluation frameworks, and production rollout.

Build robust RAG systems: Develop end-to-end RAG pipelines (ingestion → chunking → embedding → retrieval → synthesis) with observability, feedback loops, and A/B testing for groundedness and hallucination reduction.

Fine-tune foundation models: Adapt and fine-tune open-source or hosted LLMs using LoRA/QLoRA or other parameter-efficient methods; manage evaluation datasets and benchmark performance.

Develop agentic workflows: Implement multi-step, tool-using AI agents with planning, memory, tool-calling, and safe execution policies.

ML/LLM engineering: Build high-quality Python libraries, APIs, and production workflows with CI/CD, automated testing, experiment tracking, and model governance.

Data & platforms: Collaborate with Data Engineering to operationalize pipelines on Databricks/Snowflake and integrate models with enterprise cloud AI/ML services.

Must-Have Skills

Python (advanced; architecture, packaging, testing, performance)

Machine Learning Engineering using Databricks or MLflow

Built & deployed models in enterprise production environments

LLM or RAG experience, including prior fine-tuning of any LLM

Industrial or complex enterprise industry experience (strong preference)

Nice-to-Have Skills

Oil & Gas or Industrial domain background

Experience at large/enterprise companies

Master’s degree (preferred)

Required Qualifications

Mathematics & ML Foundations

Strong knowledge of Statistics, Linear Algebra, Calculus, and Optimization

Ability to explain trade-offs in model design and behavior

Programming

Advanced Python (typing, async, clean software architecture)

API and library development

Python Ecosystem

Data: pandas, NumPy, Matplotlib/Seaborn, PySpark

ML: scikit-learn, XGBoost, LightGBM

Deep Learning: PyTorch or TensorFlow

Generative AI: LangChain, LlamaIndex, Haystack, Hugging Face

Generative AI Expertise

Experience with multiple LLM/LVM providers (GPT, Claude, Gemini, Llama, etc.)

Proven foundation model fine-tuning experience

End-to-end RAG system development

Prompt engineering and multi-step agentic workflows

Embedding models and vector search systems

Software Craftsmanship & Platforms

Git (GitHub/GitLab/Bitbucket)

SQL + NoSQL databases

Cloud ML platforms: AWS, Azure, or GCP (SageMaker, Azure ML, Vertex AI)

Databricks or Snowflake

Azure AI Foundry (required)

Preferred Qualifications

Vector Stores: FAISS, Milvus, Pinecone, Weaviate; hybrid retrieval

LLMOps/Observability: MLflow, LangSmith, OpenTelemetry, RAGAS/DeepEval

Orchestration: Airflow, Prefect

Data Engineering: Kafka, Delta, Parquet, Unity Catalog

Containers & Infra: Docker, Kubernetes, GitHub Actions, Azure DevOps, Terraform

Safety & Governance: content moderation, jailbreak defense, PII protection

MLOps Patterns: shadow/canary deployments, A/B testing, blue-green rollouts

Domain: Energy, industrial IoT, manufacturing, or similar data-rich environments
