Job Description
We're seeking a visionary Data Architect with deep expertise in Databricks to lead the design, implementation, and optimization of our enterprise data architecture. You'll be instrumental in shaping scalable data solutions that empower analytics, AI, and business intelligence across the organization.
If you thrive in a fast-paced environment, love solving complex data challenges, and have a passion for cloud platforms like AWS Databricks, we want to hear from you.
Responsibilities
- Design & Architecture: Architect scalable, secure Lakehouse and data platform solutions using Databricks, Spark, Delta Lake, and cloud storage.
- Implementation & Development: Lead implementation of ETL/ELT pipelines (batch & real-time), Databricks notebooks, jobs, Structured Streaming, and PySpark/Scala code for production workloads.
- Data Modeling & Pipelines: Define canonical data models, schema evolution strategies, and optimized ingestion patterns to support analytics, BI, and ML use cases.
- Performance & Cost Optimization: Tune Spark jobs, Delta tables, cluster sizing, and storage to balance performance, latency, and cost-efficiency.
- Governance & Compliance: Implement data quality, lineage, access controls, and compliance measures (Unity Catalog, RBAC) to meet internal and regulatory standards.
- Migration & Modernization: Lead migration from legacy warehouses to cloud platforms and modern data architectures (Databricks, Snowflake), ensuring minimal disruption.
- DevOps & Automation: Define CI/CD and Infrastructure as Code best practices for data platform deployments (Terraform, CI pipelines, Databricks Jobs API).
- Leadership & Mentorship: Mentor engineers, drive architecture reviews, and collaborate with stakeholders to translate business needs into technical solutions.
Required Skills & Qualifications
- 12+ years in data architecture with 5+ years hands-on experience in Databricks or equivalent Lakehouse platforms.
- Strong experience with Snowflake and modern data warehousing patterns.
- Cloud Platform: Proven experience on AWS (preferably with AWS Databricks) — familiarity with S3, IAM, Glue, and networking for secure deployments.
- Core Technologies: Deep proficiency in Apache Spark, Delta Lake, PySpark/Scala, SQL, and performance tuning.
- Data Engineering: Experience designing ETL/ELT pipelines, data modeling, partitioning strategies, and data quality frameworks.
- Automation & DevOps: Familiarity with CI/CD, Terraform (or other IaC), Databricks Jobs API, and pipeline orchestration tools (Airflow, Prefect, dbt).
- Governance & Security: Knowledge of data governance, metadata management, lineage, RBAC, and regulatory compliance best practices.
- Communication: Excellent stakeholder management, documentation, and cross-functional collaboration skills.
Preferred Qualifications
- Databricks Certified Data Engineer or Architect.
- Experience with MLflow, Unity Catalog, and Lakehouse architecture.
- Background in machine learning, AI, or advanced analytics.
- Experience with tools like Apache Airflow, dbt, or Power BI/Tableau.