Purple Drive
Data Engineer
Location: Pittsburgh, PA
Job Description
We are seeking a skilled Data Engineer with strong expertise in Databricks, PySpark, Python, and SQL to design, develop, and maintain scalable data pipelines for advanced analytics and business intelligence. The ideal candidate will have hands-on experience working with cloud-based platforms, modern data lakehouse architectures, and distributed data processing systems. You will collaborate with cross-functional teams to deliver high-performance, reliable, and secure data solutions.
Key Responsibilities
Design, build, and maintain data pipelines using Databricks and PySpark.
Develop ETL/ELT workflows for structured, semi-structured, and unstructured data.
Write and optimize Python and SQL scripts for data transformation, validation, and reporting.
Implement data lakehouse solutions for scalable storage and analytics.
Ensure data quality, governance, and lineage tracking across pipelines.
Collaborate with analysts, data scientists, and business teams to deliver business-ready datasets.
Deploy and maintain pipelines in cloud environments (AWS / Azure / GCP).
Monitor and troubleshoot data workflows, ensuring high availability and performance.
Mandatory Skills
Databricks - strong hands-on experience with data engineering and ML workflows.
PySpark - expertise in distributed data processing.
Python - proficient in scripting, automation, and ETL development.
SQL - strong ability to write complex queries, optimize performance, and work with large datasets.
Cloud Platforms - AWS / Azure / GCP (experience with storage, compute, and orchestration services).
Data Warehousing - knowledge of Redshift, Snowflake, or Synapse (preferred).
Data Governance & Quality - experience with schema validation and testing frameworks.
Collaboration - ability to work in Agile teams and communicate effectively with stakeholders.