Purple Drive
POSITION SUMMARY
We are seeking an experienced Data Engineer specializing in the Databricks platform for a 6-month contract position based in Seattle, WA. This role focuses on designing, building, and deploying robust data pipelines and ETL processes within a cloud data platform environment. The ideal candidate will bring strong expertise in Databricks, data modeling, and cross-functional collaboration to support enterprise data governance initiatives.
RESPONSIBILITIES
DATA PIPELINE DEVELOPMENT
- Design, build, and deploy data extraction, transformation, and loading (ETL) processes and pipelines (see the sketch below)
- Extract data from various sources, including databases, APIs, and data files
- Develop and maintain scalable data pipelines within the Databricks Cloud Data Platform
- Ensure data quality, reliability, and performance across all pipeline processes
- Implement data validation and monitoring mechanisms
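As a purely illustrative example of the pipeline work above, here is a minimal PySpark sketch of an extract-transform-load step with a simple validation gate. All table and column names (raw.orders, order_id, curated.orders_clean) are hypothetical placeholders, not systems referenced in this posting.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders-etl").getOrCreate()

# Extract: read from a (hypothetical) raw source table.
raw = spark.read.table("raw.orders")

# Transform: normalize types, drop null keys and duplicates.
clean = (
    raw
    .withColumn("order_ts", F.to_timestamp("order_ts"))
    .filter(F.col("order_id").isNotNull())
    .dropDuplicates(["order_id"])
)

# Validate: fail fast rather than load an empty or broken result.
if clean.count() == 0:
    raise ValueError("Validation failed: no rows survived cleaning")

# Load: persist to a curated Delta table.
clean.write.format("delta").mode("overwrite").saveAsTable("curated.orders_clean")
```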
DATABRICKS PLATFORM MANAGEMENT
- Build comprehensive data models that reflect domain expertise and meet current business needs
- Ensure data models remain flexible and adaptable as business strategy evolves
- Monitor and optimize Databricks cluster performance for cost-effective scaling and resource utilization
- Implement and maintain Delta Lake for optimized data storage, ensuring data reliability, performance, and versioning (see the sketch below)
- Leverage Databricks Unity Catalog for data governance and collaboration
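To make the Delta Lake and Unity Catalog duties concrete, here is a brief sketch of routine maintenance and governance commands, issued as Spark SQL from Python. The three-level name main.sales.orders and the analysts group are hypothetical; OPTIMIZE/ZORDER, VERSION AS OF, and GRANT are standard Databricks Delta and Unity Catalog SQL.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Compact small files and co-locate rows on a common filter column.
spark.sql("OPTIMIZE main.sales.orders ZORDER BY (customer_id)")

# Delta versioning ("time travel"): query an earlier snapshot of the table.
previous = spark.sql("SELECT * FROM main.sales.orders VERSION AS OF 0")

# Unity Catalog governance: grant read-only access to an analyst group.
spark.sql("GRANT SELECT ON TABLE main.sales.orders TO `analysts`")
```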
DEVOPS & AUTOMATION
- Automate CI/CD pipelines for data workflows using Azure DevOps
- Implement version control and deployment strategies for data pipelines
- Ensure automated testing and quality assurance for data processes (see the sketch below)
- Maintain infrastructure-as-code practices for data platform components
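One common way to meet the automated-testing expectation is to unit-test transformations with pytest against a local SparkSession, so the same suite can run in an Azure DevOps pipeline on every commit. The dedupe_orders function below is a hypothetical transformation written for this sketch, not code from the role.

```python
import pytest
from pyspark.sql import SparkSession, functions as F

def dedupe_orders(df):
    # Hypothetical transformation under test: drop null keys and duplicates.
    return df.filter(F.col("order_id").isNotNull()).dropDuplicates(["order_id"])

@pytest.fixture(scope="module")
def spark():
    # Local session so the test runs in CI without a Databricks cluster.
    return SparkSession.builder.master("local[2]").appName("tests").getOrCreate()

def test_dedupe_orders(spark):
    df = spark.createDataFrame(
        [(1, "a"), (1, "a"), (None, "b")], ["order_id", "item"]
    )
    # Two bad rows (one duplicate, one null key) leave exactly one survivor.
    assert dedupe_orders(df).count() == 1
```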
DATA ARCHITECTURE & STORAGE
- Demonstrate expertise in database storage concepts, including data lakes, relational databases, NoSQL, graph databases, and data warehousing
- Design and implement efficient data storage solutions
- Optimize data access patterns and query performance
- Ensure proper data partitioning and indexing strategies (see the sketch below)
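As a sketch of the partitioning point above: on Delta Lake, partitioning a large table by a low-cardinality date column lets queries that filter on that column prune whole partitions instead of scanning the full table. The raw.events and curated.events table names are hypothetical.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

events = spark.read.table("raw.events")  # hypothetical source table

# Partition by event date: a daily-filtered query reads only the
# matching partitions rather than the entire table.
(
    events
    .withColumn("event_date", F.to_date("event_ts"))
    .write.format("delta")
    .partitionBy("event_date")
    .mode("overwrite")
    .saveAsTable("curated.events")
)
```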
COLLABORATION & COMMUNICATION
- Collaborate with cross-functional teams to support data governance initiatives
- Communicate technical concepts effectively to both technical and non-technical audiences
- Work with business stakeholders to understand data requirements
- Provide documentation and knowledge transfer for data solutions
QUALIFICATIONS
REQUIRED EXPERIENCE
- 5+ years of experience in data engineering and ETL development
- 3+ years of hands-on experience with the Databricks platform
- Strong experience with cloud data platforms (Azure, AWS, or GCP)
- Proven track record in building and maintaining data pipelines at scale
TECHNICAL SKILLS
Databricks Expertise:
- Databricks workspace and cluster management
- Delta Lake implementation and optimization
- Databricks Unity Catalog
- Spark and PySpark programming
Programming & Development:
- Strong coding skills in Python, Scala, or SQL
- Experience with data transformation and data quality frameworks
- Knowledge of data integration patterns and best practices
Database & Storage:
- Data lake architecture and design
- Relational databases (SQL Server, PostgreSQL, etc.)
- NoSQL databases (MongoDB, Cassandra, etc.)
- Graph databases and data warehousing concepts
DevOps & Automation:
- Azure DevOps or similar CI/CD tools
- Infrastructure as Code (Terraform, ARM templates)
- Git version control and branching strategies
SOFT SKILLS
- Strong analytical and problem-solving abilities
- Excellent written and verbal communication skills
- Ability to explain complex technical concepts to diverse audiences
- Collaborative mindset for cross-functional team environments
- Detail-oriented with a focus on data quality and reliability
PREFERRED QUALIFICATIONS
- Databricks certification (Data Engineer Associate/Professional)
- Experience with real-time data processing and streaming
- Knowledge of data governance frameworks and best practices
- Experience with data visualization tools (Power BI, Tableau)
- Background in Agile/Scrum development methodologies