Purple Drive
*** LOCAL ONLY ***
Role Overview
We are seeking a highly skilled Senior Data Engineer with strong expertise in building scalable, secure, and high-performance data solutions on AWS. The ideal candidate should have hands-on experience with Databricks (DBX), distributed data processing, cloud-native frameworks, and automation using Terraform.
Key Responsibilities
- Design, develop, and optimize data pipelines on AWS leveraging EMR, EKS, and Databricks (a rough sketch of such a pipeline follows this list).
- Implement Infrastructure as Code (IaC) using Terraform to automate cloud deployments.
- Develop scalable solutions in Scala, Python, and Java for big data processing.
- Integrate structured, semi-structured, and unstructured data sources into the data platform.
- Collaborate with data scientists, analysts, and application teams to deliver end-to-end solutions.
- Ensure platform security, reliability, cost optimization, and adherence to best practices.
- Implement CI/CD pipelines and monitoring for data workflows.
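For illustration only: a minimal PySpark sketch of the kind of batch pipeline described above, landing raw JSON from S3 into a partitioned Delta Lake table. The bucket paths, column names, and schema are hypothetical, and the Delta format assumes the Databricks runtime (or the open-source delta-spark package).

```python
# Minimal sketch only: a daily batch pipeline landing raw JSON into Delta.
# All paths, column names, and the schema are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-daily-batch").getOrCreate()

# Ingest semi-structured JSON dropped into an S3 landing zone.
raw = spark.read.json("s3://example-landing-bucket/orders/2024-01-01/")

# Light cleanup: proper timestamp type, a derived partition column,
# and a basic quality filter.
cleaned = (
    raw.withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("order_date", F.to_date("order_ts"))
       .filter(F.col("amount") > 0)
)

# Append into a date-partitioned Delta table.
(cleaned.write.format("delta")
        .mode("append")
        .partitionBy("order_date")
        .save("s3://example-curated-bucket/orders_delta/"))
```

Required Skills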
- AWS Cloud Services: EMR, EKS, S3, IAM, Lambda, Glue, VPC.
- Databricks (DBX): Delta Lake, Spark, MLflow (nice to have).
- Programming: strong proficiency in Scala, Python, and Java.
- Infrastructure as Code (IaC): Terraform (must-have).
- Data Engineering: Spark, distributed computing, data pipelines, orchestration.
- DevOps/CI-CD: Git, Jenkins, Docker, Kubernetes (EKS).
- Strong problem-solving, debugging, and optimization skills.
Nice-to-Have Skills
- Experience with event-driven architectures (Kafka, Kinesis); see the sketch after this list.
- Familiarity with data governance, lineage, and security frameworks.
- Knowledge of Agile methodologies and collaborative development practices.
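Again for illustration only: a minimal Spark Structured Streaming sketch of the event-driven ingestion path mentioned above, reading from Kafka into Delta. The broker address, topic name, and storage paths are placeholders, not specifics of this role.

```python
# Minimal sketch only: Kafka -> Delta via Spark Structured Streaming.
# Broker, topic, and S3 paths are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-events-stream").getOrCreate()

# Subscribe to a Kafka topic; requires the spark-sql-kafka connector.
events = (
    spark.readStream.format("kafka")
         .option("kafka.bootstrap.servers", "broker1.example.internal:9092")
         .option("subscribe", "orders-events")
         .load()
)

# Kafka delivers key/value as binary; cast to strings for downstream parsing.
decoded = events.select(
    F.col("key").cast("string").alias("key"),
    F.col("value").cast("string").alias("payload"),
    "timestamp",
)

# Checkpointed append into Delta gives exactly-once sink semantics.
query = (
    decoded.writeStream.format("delta")
           .option("checkpointLocation", "s3://example-bucket/checkpoints/orders-events/")
           .outputMode("append")
           .start("s3://example-bucket/streams/orders_events_delta/")
)
query.awaitTermination()
```

On Databricks, a Kinesis feed can be consumed in the same way by swapping in the built-in kinesis source.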