Zapcom

Data Engineer

Zapcom, Boston, Massachusetts, US 02298


Responsibilities

Design and deploy scalable ETL/ELT pipelines to ingest, transform, and load clinical data from diverse sources (EMRs, labs, IoT devices, data lakes, FHIR/HL7 APIs) into Azure and Snowflake.

Architect and optimize Microsoft Azure and Snowflake environments for clinical data storage, ETL/ELT workloads, machine learning operations (MLOps), performance tuning, cost management, and secure data sharing.

Ensure compliance with healthcare regulations (HIPAA, GDPR) by implementing data anonymization, encryption, and audit trails.

Collaborate with clinical stakeholders to translate business requirements into technical solutions for analytics and reporting.

Develop and maintain data governance frameworks, including metadata management, data lineage, and quality checks (e.g., validation of lab results, patient demographics).

Automate data pipelines using orchestration tools (Apache Airflow, Prefect) and integrate real-time streaming solutions (Kafka) where applicable; see the orchestration sketch after this list.

Build and maintain documentation for data models, pipelines, and processes to ensure reproducibility and transparency.
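
To illustrate the orchestration responsibilities above, here is a minimal sketch of a daily Apache Airflow DAG that pulls FHIR Observation resources and hands them to a Snowflake load step. The endpoint URL, table name, and helper logic are hypothetical placeholders, and the schedule argument assumes Airflow 2.4+ (older versions use schedule_interval).

from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract_fhir_observations(**context):
    # Hypothetical extraction step: pull Observation resources from a FHIR API.
    # A real pipeline would page through the bundle and stage results to cloud storage.
    import requests
    resp = requests.get("https://fhir.example.org/Observation", timeout=30)  # placeholder URL
    resp.raise_for_status()
    return resp.json()

def load_to_snowflake(**context):
    # Hypothetical load step: in practice this might issue COPY INTO or rely on Snowpipe.
    staged = context["ti"].xcom_pull(task_ids="extract_fhir_observations")
    print(f"Would load {len(staged.get('entry', []))} records into CLINICAL.OBSERVATIONS")

with DAG(
    dag_id="clinical_etl",              # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",                  # Airflow 2.4+ spelling; older versions use schedule_interval
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract_fhir_observations", python_callable=extract_fhir_observations)
    load = PythonOperator(task_id="load_to_snowflake", python_callable=load_to_snowflake)
    extract >> load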

Skills & Expertise

Advanced proficiency in Snowflake (Snowpipe, Time Travel, Zero-Copy Cloning) and SQL for complex transformations; see the Snowflake sketch after this list.

Hands-on experience with ETL/ELT tools (Apache Spark, AWS Glue, Azure Data Factory) and cloud platforms (AWS, Azure, GCP).

Strong programming skills in Python/Scala (Pandas, PySpark) for data scripting and automation.

Familiarity with healthcare data formats (OMOP, FHIR, HL7, DICOM) and clinical workflows.

Expertise in federated learning and running large jobs on high‑performance computing servers.

Data Governance: Ability to implement data quality frameworks (e.g., Great Expectations) and metadata management tools.

Regulatory Compliance: Proven experience securing PHI/PII data and adhering to HIPAA/GDPR requirements.

Problem‑Solving: Ability to troubleshoot pipeline failures, optimize query performance, and resolve data discrepancies.

Collaboration: Strong communication skills to work with cross‑functional teams (clinicians, analysts, IT).
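
As a concrete illustration of the Snowflake features named above, here is a minimal sketch using the snowflake-connector-python package. The account, credentials, and table names are placeholders, and Time Travel queries only work within the table's data retention window.

import snowflake.connector

# Placeholder credentials; real deployments should use key-pair auth or a secrets manager.
conn = snowflake.connector.connect(
    account="my_account",    # hypothetical account identifier
    user="etl_user",
    password="...",          # placeholder
    warehouse="CLINICAL_WH",
    database="CLINICAL",
    schema="PUBLIC",
)
cur = conn.cursor()

# Zero-Copy Cloning: create a writable dev copy of a table without duplicating storage.
cur.execute("CREATE TABLE lab_results_dev CLONE lab_results")

# Time Travel: query the table as it looked one hour ago.
cur.execute("SELECT COUNT(*) FROM lab_results AT(OFFSET => -3600)")
print(cur.fetchone())

cur.close()
conn.close()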

Requirements

Snowflake SnowPro Core Certification (or higher).

AWS/Azure/GCP Data Engineering Certification (e.g., AWS Certified Data Analytics, Azure Data Engineer Associate).

Experience running jobs on high-performance computing servers.

Healthcare‑specific certifications (e.g., HL7 FHIR, Certified Health Data Analyst (CHDA)).

Security certifications (CISSP, CIPP) for handling sensitive clinical data.

3+ years of experience in data engineering, with 2+ years focused on healthcare/clinical data (e.g., hospitals, EMR systems, clinical trials).

2+ years of hands‑on experience with Snowflake in production environments.

Proven track record of building ETL pipelines for large-scale clinical datasets; see the PySpark sketch after this list.

Experience with OMOP CDM, Epic/Cerner EHR systems, or clinical data lakes.

Exposure to DevOps practices (CI/CD, Terraform) and Agile methodologies.

Some front‑end development experience is preferred.
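
As a rough illustration of the large-scale ETL experience described above, here is a minimal PySpark sketch that cleans raw lab results landed in a data lake and writes a curated copy. The paths, column names, and schema are hypothetical.

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("clinical_etl_example").getOrCreate()

# Hypothetical input: raw lab results landed as Parquet in a data lake.
labs = spark.read.parquet("s3://example-bucket/raw/lab_results/")  # placeholder path

cleaned = (
    labs
    .filter(F.col("patient_id").isNotNull())                       # drop records missing a patient ID
    .withColumn("result_value", F.col("result_value").cast("double"))
    .dropDuplicates(["patient_id", "test_code", "collected_at"])   # de-duplicate repeated submissions
)

cleaned.write.mode("overwrite").parquet("s3://example-bucket/curated/lab_results/")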
