Boston Public Health Commission

Data Engineer - Casual

Boston Public Health Commission, Boston, Massachusetts, US, 02298

The Data Engineer - Casual will support the Boston Public Health Commission’s Data Modernization Initiative (DMI), focusing on building and maintaining data pipelines on the Microsoft Azure platform (Azure Data Factory, Azure Data Lake Gen2), improving data quality, and supporting the development of BPHC’s Azure Data Lake. The role includes writing SQL or Python scripts to clean, transform, and validate datasets before they are stored in the Azure Data Lake. It offers hands‑on experience with cloud data engineering, automation, and the governance tools used to modernize public health systems.

Key Responsibilities

Assist in building and maintaining ETL/ELT data pipelines that load data into the BPHC Data Lake and Data Warehouse.

Help design and implement data ingestion workflows for structured and unstructured datasets from APIs, external systems, databases, flat files, and public data sources.

Support Data Lake organization, including folder structures, metadata tagging, data partitioning, and schema alignment.

Maintain high‑quality data storage practices, including data versioning, lineage tracking, and format optimization (e.g., Parquet, Delta).

Participate in building and optimizing data models, tables, and views used by dashboards and analytic systems.

Support data validation, quality checks, deduplication, and data cleaning processes.

Document data flows, pipeline logic, and transformations for the Data Lake environment.

Assist in automating ingestion and transformation processes using Python, SQL, and Azure‑based tools.

Collaborate with analysts and program teams to understand data needs and implement scalable solutions.

Qualifications

Foundational programming experience in SQL, Python, or R.

Basic understanding of data pipelines, ETL/ELT processes, and data modeling concepts.

Exposure to cloud platforms such as Microsoft Azure, AWS, or Google Cloud (Azure preferred).

Familiarity with tools like Azure Data Factory, Databricks, or Synapse Analytics is a plus.

Experience with version control (e.g., Git/GitHub) and data visualization tools (e.g., Power BI or Tableau) is preferred.

Awareness of data security, privacy, and governance principles (HIPAA, metadata standards, etc.) is a plus.
