Eightelevengroup
Data Scientist
Brooksource
Fortune 500 Oil & Gas Client
Houston
Overview
Our client's Digital Solutions team is expanding its advanced analytics and AI capabilities to support major enterprise initiatives, including the BAP AI Program, Sales Enablement AI, and emerging agentic workflow automation projects. With current headcount limited to two scientists, the team is building a strong bench of high-performing Data Scientists skilled in GenAI, enterprise ML engineering, and production-scale model deployment.
This role is ideal for a senior-level Data Scientist with hands-on experience developing, fine-tuning, and deploying models—especially LLM, RAG, and agentic systems—within complex industrial or enterprise environments.
What You’ll Do
Design & deliver GenAI solutions: Architect and implement LLM/LVM applications (text and vision), including prompt strategies, guardrails, evaluation frameworks, and production rollout.
Build robust RAG systems: Develop end-to-end RAG pipelines (ingestion → chunking → embedding → retrieval → synthesis) with observability, feedback loops, and A/B testing for groundedness and hallucination reduction.
Fine-tune foundation models: Adapt and fine-tune open-source or hosted LLMs using LoRA/QLoRA or other parameter-efficient methods; manage evaluation datasets and benchmark performance.
Develop agentic workflows: Implement multi-step, tool-using AI agents with planning, memory, tool-calling, and safe execution policies.
ML/LLM engineering: Build high-quality Python libraries, APIs, and production workflows with CI/CD, automated testing, experiment tracking, and model governance.
Data & platforms: Collaborate with Data Engineering to operationalize pipelines on Databricks/Snowflake and integrate models with enterprise cloud AI/ML services.
Must-Have Skills
Python (advanced; architecture, packaging, testing, performance)
Machine Learning Engineering using Databricks or MLflow
Built & deployed models in enterprise production environments
LLM or RAG experience, including prior fine-tuning of any LLM
Industrial or complex enterprise industry experience (strong preference)
Nice-to-Have Skills
Oil & Gas or Industrial domain background
Experience at large/enterprise companies
Master’s degree (preferred)
Required Qualifications
Mathematics & ML Foundations
Strong knowledge of Statistics, Linear Algebra, Calculus, and Optimization
Ability to explain trade-offs in model design and behavior
Programming
Advanced Python (typing, async, clean software architecture)
API and library development
Python Ecosystem
Data: pandas, NumPy, Matplotlib/Seaborn, PySpark
ML: scikit-learn, XGBoost, LightGBM
Deep Learning: PyTorch or TensorFlow
Generative AI: LangChain, LlamaIndex, Haystack, Hugging Face
Generative AI Expertise
Experience with multiple LLM/LVM providers (GPT, Claude, Gemini, Llama, etc.)
Proven foundation model fine-tuning experience
End-to-end RAG system development
Prompt engineering and multi-step agentic workflows
Embedding models and vector search systems
Software Craftsmanship & Platforms
Git (GitHub/GitLab/Bitbucket)
SQL + NoSQL databases
Cloud ML platforms: AWS, Azure, or GCP (SageMaker, Azure ML, Vertex)
Databricks or Snowflake
Azure AI Foundry (required)
Preferred Qualifications
Vector Stores: FAISS, Milvus, Pinecone, Weaviate; hybrid retrieval
LLMOps/Observability: MLflow, LangSmith, OpenTelemetry, RAGAS/DeepEval
Orchestration: Airflow, Prefect
Data Engineering: Kafka, Delta, Parquet, Unity Catalog
Containers & Infra: Docker, Kubernetes, GitHub Actions, Azure DevOps, Terraform
Safety & Governance: content moderation, jailbreak defense, PII protection
MLOps Patterns: shadow/canary deployments, A/B testing, blue-green rollouts
Domain: Energy, industrial IoT, manufacturing, or similar data-rich environments