Purple Drive
Title: Data Engineer (PySpark / AWS)
Location: Wilmington, DE - Onsite
Type: Contract
Job Overview
We are seeking a highly skilled Data Engineer with strong expertise in PySpark, ETL pipelines, and AWS services to join our onsite team in Wilmington, DE. The role requires hands-on experience designing, developing, and maintaining large-scale data processing solutions while collaborating with cross-functional teams to deliver high-quality, scalable data systems.
Key Responsibilities
- Design, build, and optimize ETL and streaming data pipelines using PySpark (see the sketch after this list).
- Leverage AWS services (SQS, Lambda, S3, RDS) to implement scalable and secure data solutions.
- Develop and maintain clean, efficient, and reusable code in Python and other programming languages.
- Collaborate with business and technical stakeholders to translate requirements into data engineering solutions.
- Ensure data integrity, governance, and compliance across all platforms.
- Conduct performance tuning, troubleshooting, and optimization of data pipelines.
- Document processes, workflows, and best practices for continuous improvement.
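As a rough illustration of the first responsibility, the sketch below shows a minimal PySpark batch ETL job that reads raw JSON from S3, cleanses it, and writes partitioned Parquet back out. The bucket names, paths, and column names are hypothetical placeholders, not details from this posting, and an S3-enabled Spark environment (e.g. with the hadoop-aws connector) is assumed.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-etl").getOrCreate()

# Extract: read raw JSON order events from a hypothetical S3 prefix.
raw = spark.read.json("s3a://example-bucket/raw/orders/")

# Transform: keep completed orders and aggregate revenue per day.
daily = (
    raw.filter(F.col("status") == "COMPLETED")
       .withColumn("order_date", F.to_date("created_at"))
       .groupBy("order_date")
       .agg(F.count("*").alias("orders"), F.sum("amount").alias("revenue"))
)

# Load: write partitioned Parquet back to a curated S3 location.
daily.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3a://example-bucket/curated/daily_orders/"
)
```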
Must-Have Skills
- Extensive hands-on experience with PySpark and ETL/streaming pipelines.
- Strong knowledge of AWS services (SQS, Lambda, S3, RDS); a minimal SQS example follows this list.
- Advanced proficiency in at least one programming language (Python preferred).
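As a rough illustration of working with the listed AWS services, the sketch below shows a minimal boto3 consumer that long-polls an SQS queue and deletes messages after processing. The queue URL and message fields are hypothetical placeholders.

```python
import json
import boto3

# Hypothetical queue URL for illustration only.
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/example-events"

sqs = boto3.client("sqs")

def poll_once():
    """Receive one batch of messages, process them, then delete each."""
    resp = sqs.receive_message(
        QueueUrl=QUEUE_URL,
        MaxNumberOfMessages=10,
        WaitTimeSeconds=20,  # long polling to reduce empty responses
    )
    for msg in resp.get("Messages", []):
        event = json.loads(msg["Body"])
        print("processing order", event.get("order_id"))
        # Delete only after successful processing so failures are retried.
        sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])
```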
Preferred Skills
- Proficiency with Alteryx, Tableau, ThoughtSpot, and SQL.
- Deep understanding of advanced AWS concepts (Glue, EMR).
- Practical cloud-native development experience.
Good to Have
- Experience with shell scripting and Unix systems.
- Proficiency in Terraform for infrastructure as code.
- Familiarity with Large Language Models (LLMs), including practical use case development with Azure OpenAI (see the sketch after this list).
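As a rough illustration of the LLM item, the sketch below calls an Azure OpenAI chat deployment through the openai Python package (v1+). The endpoint, API version, deployment name, and prompt are placeholders, and the exact client setup may differ by environment.

```python
import os
from openai import AzureOpenAI  # requires openai>=1.0

# Endpoint, API version, and deployment name are placeholders for illustration.
client = AzureOpenAI(
    azure_endpoint="https://example-resource.openai.azure.com",
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",
)

resp = client.chat.completions.create(
    model="gpt-4o-deployment",  # the name of your Azure deployment
    messages=[{"role": "user", "content": "Summarize yesterday's pipeline failures."}],
)
print(resp.choices[0].message.content)
```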
Experience & Qualifications
- 5+ years of applied experience in data engineering or software development.
- Formal training or certification in software engineering concepts.
- Bachelor's or Master's degree in Computer Science, Computer Engineering, Mathematics, or a related technical field.