Purple Drive
Job Title: Senior Data Engineer - Python, PySpark, Apache Airflow & AWS
Role Overview
We are seeking a highly skilled Senior Data Engineer with extensive hands-on experience in Python, PySpark, Apache Airflow, and AWS services to design, build, and manage scalable data pipelines and architectures. The ideal candidate will have deep expertise in data migration, data warehousing, and cloud ETL/ELT solutions, with excellent communication skills to engage effectively with business and IT stakeholders.
Key Responsibilities
- Design, develop, and maintain end-to-end data pipelines with Python and PySpark, orchestrated with Apache Airflow (see the sketch after this list).
- Use AWS services and related technologies such as EKS, EMR, Glue, HashiCorp Vault, Docker, and Kubernetes to build secure, scalable, and efficient data workflows.
- Lead development and migration efforts for Operational Data Stores, Enterprise Data Warehouses, Data Lakes, and Data Marts.
- Collaborate with business and IT teams to gather requirements and translate them into reliable data engineering solutions.
- Plan project execution and accurately estimate resource effort to ensure timely delivery.
- Participate actively in Agile teams and promote Agile ways of working.
Qualifications
- At least 8 years of solid, hands-on experience with Python, PySpark, Apache Airflow, and AWS services (EKS, EMR, Glue), plus HashiCorp Vault, Docker, and Kubernetes.
- Strong working knowledge of SQL and the data warehousing lifecycle (an absolute requirement).
- Over 10 years of overall experience in data migrations and in building data stores, including Operational Data Stores, EDWs, Data Lakes, and Data Marts.
- Proven experience designing and orchestrating data pipelines with Apache Airflow.
- Experience with cloud-based ETL/ELT tools such as dbt, Glue, or EMR is a plus.
- Excellent communication skills for effective collaboration with business and IT stakeholders.
- Expertise in project planning, execution, and effort estimation.
- Exposure to, and experience working within, Agile frameworks.