Purple Drive

Senior Data Engineer

Purple Drive, Columbus, Ohio, United States, 43224

Role Overview

We are seeking a Data Engineer with strong expertise in Databricks, Python, and PySpark, coupled with experience in CI/CD pipelines. The ideal candidate will have a solid background in data management, data warehousing, and data integration, with proven experience in developing scalable, high-performance data solutions on cloud platforms.

Key Responsibilities

Design, build, and optimize data pipelines using Databricks, PySpark, and Python.
Develop and maintain data quality rules, transformations, and mappings to ensure data accuracy and consistency.
Write and optimize complex SQL queries for large-scale data processing.
Work with cloud platforms (AWS/Azure/GCP) to deliver secure, scalable solutions.
Support data integration, data warehousing, and cleansing initiatives.
Collaborate with cross-functional teams to deliver high-quality solutions following Agile Scrum practices.
Troubleshoot production issues in Oracle and MS SQL Server environments and drive timely resolution.
Implement CI/CD pipelines for continuous integration, testing, and deployment.
Follow best practices for data management and software development lifecycle (SDLC).

Required Skills & Experience

6-8 years of overall IT/data engineering experience.
Must have: Databricks, Python, PySpark, and CI/CD experience.
3-5 years of experience in data management, warehousing, integration, and cleansing.
Strong SQL programming skills (Oracle and MS SQL preferred).
2+ years of experience with Python (Perl is a plus).
3+ years working with cloud technologies (AWS, Azure, or GCP).
Strong analytical, problem-solving, and troubleshooting skills.
Experience in Agile Scrum delivery and SDLC processes.

Nice-to-Have Skills

Experience with data governance and metadata management.
Exposure to ETL tools and orchestration frameworks (Airflow, ADF, etc.).
Knowledge of DevOps practices and containerization (Docker, Kubernetes).