ZipRecruiter
Job Description

Job Title: Senior/Lead Data Engineer
Location: Pittsburgh, PA / Dallas, TX / Cleveland, OH
Client: PNC Bank

Job Description:
PNC Bank is seeking a highly skilled Senior/Lead Data Engineer to join its Data & Analytics team. The ideal candidate will have strong expertise in building and optimizing scalable data pipelines using Hadoop and PySpark, with a focus on enabling data-driven decision-making across the enterprise. This role requires hands-on technical leadership, deep knowledge of big data technologies, and the ability to collaborate with cross-functional teams.

Key Responsibilities:
Design, build, and optimize large-scale data pipelines for structured and unstructured data.
Develop and maintain data ingestion, transformation, and integration workflows using PySpark, Hadoop (HDFS, Hive, HBase), and related ecosystem tools.
Implement best practices for data modeling, performance tuning, and pipeline optimization.
Collaborate with data scientists, analysts, and business stakeholders to deliver reliable and scalable data solutions.
Ensure data quality, security, and governance across data platforms.
Provide technical leadership and mentoring to junior data engineers.
Partner with cloud teams to integrate Hadoop/PySpark workloads with cloud platforms (AWS/Azure/GCP).
Troubleshoot, monitor, and optimize ETL workflows for high availability and performance.
Drive innovation by evaluating and recommending emerging technologies in the big data ecosystem.

Required Skills & Qualifications:
10+ years of experience in Data Engineering, with at least 5 years in Hadoop & PySpark.
Strong expertise in the Hadoop ecosystem (HDFS, Hive, HBase, Oozie, Sqoop, Kafka, etc.).
Proficiency in PySpark, Python, and SQL for data transformation and analytics.
Hands-on experience with ETL/ELT processes, data modeling, and performance tuning.
Familiarity with data governance, lineage, and security best practices.
Experience integrating Hadoop with cloud platforms (AWS EMR, Azure Databricks, GCP Dataproc, etc.).
Strong understanding of distributed computing, parallel processing, and big data architecture.
Excellent problem-solving, communication, and leadership skills.

Qualifications:
Experience in the banking or financial services industry.
Exposure to streaming technologies (Kafka, Spark Streaming, Flink).
Familiarity with DevOps, CI/CD pipelines, and containerization (Docker/Kubernetes).
Knowledge of Snowflake or other modern data warehouses is a plus.