Michael Baker International, Inc.
Lead Database Developer - AI/ML Focus
Michael Baker International, Inc., Denver, Colorado, United States, 80285
Michael Baker International is seeking a skilled Lead Database Developer with AI/ML expertise to architect, build and scale intelligent, data-driven applications across our enterprise ecosystem. In this role, you will design and optimise the enterprise-grade data platforms that power AI/ML products, analytics and automation initiatives. You will lead database developers, partner with data scientists, and own the roadmap for scalable data systems that enable real‑time insights and model-driven decision-making. This position reports to the VP of Data and AI in the CTO organization.
RESPONSIBILITIES
Lead design of scalable data pipelines, ingestion frameworks and distributed processing systems.
Architect enterprise data lake/lakehouse/warehouse solutions (Databricks, Snowflake, BigQuery, Redshift).
Guide data engineers on best practices, code quality and scalable data engineering patterns.
Own end‑to‑end execution of data engineering initiatives, including estimation, delivery and performance optimisation.
Build ML‑ready data environments, feature stores and training pipelines.
Partner with data scientists to productionise ML models with CI/CD/CT.
Implement model monitoring, data quality, feature versioning and automated retraining.
Support real‑time and batch feature engineering and inference pipelines.
Develop scalable ELT/ETL pipelines using Spark, PySpark, SQL, Airflow, DBT, Kafka, Kinesis.
Build high‑quality data models (dimensional, data vault, lakehouse).
Implement observability, lineage and data quality frameworks across all pipelines.
Architect MLOps pipelines using Docker, Kubernetes, Terraform, MLflow, SageMaker or Vertex AI.
Optimise cloud cost, performance and reliability for large‑scale AI/ML workloads.
Drive standards for cloud data infrastructure and reusable data engineering components.
Governance, Security & Compliance
Ensure compliance with SOC 2, GDPR and PII standards based on company needs.
Implement secure data‑sharing, encryption, IAM, tokenisation and access patterns.
Maintain metadata, cataloguing and governance processes (Collibra, Alation, Unity Catalog).
Champion emerging technologies including GenAI, vector databases and LLM‑based pipelines.
Drive innovation in AI/ML data engineering and real‑time analytics.
Team Development and Stakeholder Engagement
Lead and mentor data engineering teams.
Collaborate with data scientists, ML engineers and business stakeholders to deliver impactful solutions.
Translate business requirements into scalable data strategies.
PROFESSIONAL REQUIREMENTS
Bachelor’s degree in Computer Science or a related field, or equivalent experience.
Any data or AI/ML‑related certifications.
6–12+ years of data engineering experience, with 2–5+ years in a lead role.
Strong programming skills in Python and SQL; deep expertise in Spark/Databricks.
Experience building ML‑ready architectures, feature stores and MLOps pipelines.
Expertise with cloud platforms (AWS, Azure or GCP).
Proven ability to lead engineering teams, mentor junior engineers and drive architectural decisions.
PREFERRED QUALIFICATIONS
Experience implementing vector databases (Pinecone, FAISS, Milvus) and LLM‑based pipelines including RAG.
Background in real‑time analytics and low‑latency ML inference.
Experience in highly regulated industries (healthcare, fintech, retail, AEC, manufacturing).
Ensure quality, compliance and security across all data platforms while implementing observability, lineage and governance frameworks.
Define and execute enterprise data strategies aligned with AI/ML initiatives while championing best practices in data engineering, MLOps and cloud optimisation.
COMPENSATION
The approximate compensation range for this position is $130,000 to $170,000. This compensation range is a good‑faith estimate for the position at the time of posting. Actual compensation is dependent upon factors such as education, qualifications, experience, skillset and physical work location.
BENEFITS
We offer a comprehensive benefits package including:
401(k) Retirement Plan
Health Savings Account (HSA)
Flexible Spending Account (FSA)
Life, AD&D, short‑term and long‑term disability
Professional and personal development
Generous paid time off
Commuter and wellness benefits
Job Info
Job Identification: 308820
Posting Date: 11/25/2025, 02:16 PM
Job Schedule: Full time