ANGARAI

Databricks Engineer

ANGARAI, College Park, Maryland, US, 20741


We are seeking a highly skilled Databricks Engineer to design, build, and operate a modern Data & AI Platform leveraging the Medallion Architecture (Bronze, Silver, Gold). This role will lead the development of scalable ELT pipelines, orchestrate complex data workflows and integrate enterprise data from platforms such as PeopleSoft, D2L and Salesforce. The engineer will ensure high‑quality, governed, analytics‑ready data to support enterprise decision‑making, BI, AI and machine learning initiatives.

1. Data & AI Platform Engineering (Databricks‑Centric)

Design, implement, and optimize end‑to‑end pipelines using Databricks and Medallion Architecture best practices.

Build robust Spark/Delta Lake ETL/ELT frameworks for Bronze, Silver and Gold layers.

Develop and operationalize Databricks Workflows for orchestration, automation, dependency management and scheduling.

Apply schema evolution, data versioning and Delta Lake optimization techniques.

2. Platform & Data Ingestion

Integrate data from enterprise systems (PeopleSoft, Salesforce, D2L) via APIs, JDBC and other ingestion frameworks.

Build connectors for structured, semi‑structured and unstructured data sources.

Standardize ingestion processes with automated error handling, retry logic and alerting.

3. Data Quality, Monitoring & Governance

Develop data quality checks, validation rules, anomaly detection and reconciliation logic.

Implement monitoring solutions using tools such as Databricks metrics, Delta logs and Grafana.

Implement metadata management, lineage tracking and governance using Unity Catalog or equivalent services.

4. Security, Privacy & Compliance

Implement data security practices including row‑level security, encryption in transit and at rest, and fine‑grained access controls.

Implement data masking, tokenization, anonymization, and privacy controls aligned with GDPR, FERPA and related regulations.

Collaborate with security teams for audits and compliance evaluations.

5. AI/ML‑Ready Data Foundation

Deliver high‑quality datasets for machine learning and advanced analytics.

Enable MLOps workflows using MLflow for experiment tracking, model versioning and deployment.

Partner with AI/ML teams to build reusable features, feature stores and standardized training pipelines.

6. Cloud Data Architecture & Storage

Architect cloud‑native data solutions using ADLS or Amazon S3.

Build data lakes, data marts, and warehouse components optimized for performance and scalability.

Apply cost‑efficient storage designs, partitioning strategies and access optimizations.

7. Documentation & Enablement

Create and maintain architecture diagrams, data dictionaries, runbooks and technical documentation.

Train internal teams on Databricks, Medallion Architecture and governance frameworks.

Conduct code reviews and promote reusable engineering patterns.

8. Reporting & Accountability

Submit weekly reports detailing hours worked, completed tasks, upcoming work and blockers.

Track deliverables against project milestones and communicate risks.

Requirements

Required Qualifications

Hands‑on experience with Databricks, Spark and Delta Lake.

Strong understanding of ELT development, orchestration and monitoring.

Experience implementing Medallion Architecture and schema enforcement.

Proficiency in Python, SQL or Scala.

Experience integrating enterprise systems (PeopleSoft, Salesforce, D2L).

Familiarity with data governance and metadata tools.

Preferred Qualifications

Experience with Unity Catalog.

Experience deploying ML models using MLflow.

Familiarity with Azure or AWS cloud environments.

Knowledge of data warehouse design principles.
