
Data Architect (Databricks)

BALIN TECHNOLOGIES LLC, Denver

We are seeking a seasoned Data Architect with deep expertise in Databricks, Lakehouse architecture, and AI/ML/GenAI enablement to lead a critical modernization initiative. The role involves transforming a legacy platform into a scalable, cloud‑native, Databricks‑based architecture. You will drive the design and implementation of high‑performance data pipelines, orchestrate data workflows, and integrate AI/ML capabilities across the stack to unlock real‑time intelligence and innovation.

Responsibilities

  • Lead the architectural modernization from an on‑prem/legacy platform to a unified Databricks Lakehouse ecosystem.
  • Architect and optimize data pipelines (batch and streaming) to support AI/ML and GenAI workloads on Databricks.
  • Migrate and re‑engineer existing Spark workloads to leverage Delta Lake, Unity Catalog, and advanced performance tuning in Databricks.
  • Drive integration of AI/ML models (including GenAI use cases) into operational data pipelines for real‑time decision‑making.
  • Design and implement robust orchestration using Apache Airflow or Databricks Workflows, with CI/CD integration.
  • Establish data governance, security, and quality frameworks aligned with Unity Catalog and enterprise standards.
  • Collaborate with data scientists, ML engineers, DevOps, and business teams to enable scalable and governed AI solutions.

Required Skills

  • 12+ years in data engineering or architecture, including at least 4-5 years of hands‑on Databricks experience, with a strong focus on AI/ML enablement.
  • Deep hands‑on experience with Apache Spark, Databricks (Azure/AWS), and Delta Lake.
  • Proficiency in AI/ML pipeline integration using MLflow on Databricks or custom model deployment strategies.
  • Strong knowledge of Apache Airflow, Databricks Jobs, and cloud‑native orchestration patterns.
  • Experience with Spark Structured Streaming, Kafka, and real‑time analytics frameworks.
  • Proven ability to design and implement cloud‑native data architectures.
  • Solid understanding of data modeling, Lakehouse design principles, and lineage tracking with Unity Catalog.
  • Excellent communication and stakeholder engagement skills.

Preferred Qualifications

  • Databricks Certified Data Engineer Professional certification.
  • Experience transitioning from in‑house data platforms to Databricks or cloud‑native environments.
  • Expertise in Apache Airflow DAG design, dynamic workflows, and production troubleshooting.
  • Experience with CI/CD pipelines, Infrastructure‑as‑Code (Terraform, ARM templates), and DevOps practices.
  • Exposure to AI/ML model integration within real‑time or batch data pipelines.
  • Exposure to MLOps, MLflow, Feature Store, and model monitoring in production environments.
  • Experience with LLM/GenAI enablement, vector data and embedding storage, and their integration with Databricks.