Data Architect (Databricks) at Allata
Allata is a global consulting and technology services firm with offices in the US, India, and Argentina. We help organizations accelerate growth, drive innovation, and solve complex challenges by combining strategy, design, and advanced technology. Our expertise covers defining business vision, optimizing processes, and creating engaging digital experiences. We architect and modernize secure, scalable solutions using cloud platforms and top engineering practices. Allata also empowers clients to unlock data value through analytics and visualization and leverages artificial intelligence to automate processes and enhance decision‑making.
Role & Responsibilities
Define the overall data platform architecture (Lakehouse/EDW), including reference patterns (Medallion, Lambda, Kappa), technology selection, and integration blueprint
Design conceptual, logical, and physical data models to support multi‑tenant and vertical‑specific data products; standardize logical layers (ingest/raw, staged/curated, serving)
Establish data governance, metadata, cataloging (e.g., Unity Catalog), lineage, data contracts, and classification practices to support analytics and ML use cases
Define security and compliance controls: access management (RBAC/IAM), data masking, encryption (in transit/at rest), network segmentation, and audit policies
Architect scalability, high availability, disaster recovery (RPO/RTO), and capacity & cost management strategies for cloud and hybrid deployments
Lead selection and integration of platform components such as Databricks, Delta Lake, Delta Live Tables, Fivetran, Azure Data Factory/Data Fabric, orchestration, monitoring/observability
Design and enforce CI/CD patterns for data artifacts (notebooks, packages, infra‑as‑code), including testing, automated deployments, and rollback strategies
Define ingestion patterns (batch & streaming), file compaction strategies, partitioning schemes, and storage layout to optimize IO and costs
Specify observability practices: metrics, SLAs, health dashboards, structured logging, tracing, and alerting for pipelines and jobs
Act as technical authority and mentor for Data Engineering teams; perform architecture and code reviews for critical components
Collaborate with stakeholders (Data Product Owners, Security, Infrastructure, BI, ML) to translate business requirements into technical solutions and a roadmap
Design, develop, test, and deploy processing modules using Spark (PySpark/Scala), Spark SQL, and database stored procedures where applicable
Build and optimize data pipelines on Databricks and complementary engines (SQL Server, Azure SQL, AWS RDS/Aurora, PostgreSQL, Oracle)
Implement DevOps practices: infra‑as‑code, CI/CD pipelines (ingestion, transformation, tests, deployment), automated testing and version control
Troubleshoot and resolve complex data quality, performance, and availability issues; recommend and implement continuous improvements
Hard Skills – Must have
Previous experience as an architect or in a lead technical role on enterprise data platforms
Hands‑on experience with Databricks technologies (Delta Lake, Unity Catalog, Delta Live Tables, Auto Loader, Structured Streaming)
Strong expertise in Spark (PySpark and/or Scala), Spark SQL and distributed job optimization
Solid background in data warehouse and lakehouse design; practical familiarity with Medallion/Lambda/Kappa patterns
Experience integrating SaaS/ETL connectors (e.g., Fivetran), orchestration platforms (Airflow, Azure Data Factory, Data Fabric), and ELT/ETL tooling
Experience with relational and hybrid databases: MS SQL Server, PostgreSQL, Oracle, Azure SQL, AWS RDS/Aurora or equivalents
Proficiency in CI/CD for data pipelines (Azure DevOps, GitHub Actions, Jenkins, or similar) and packaging/deployment of artifacts (.whl, containers)
Experience with batch and streaming processing, file compaction, partitioning strategies and storage tuning
Good understanding of cloud security, IAM/RBAC, encryption, VPC/VNet concepts, and cloud networking
Familiarity with observability and monitoring tools (Prometheus, Grafana, Datadog, native cloud monitoring, or equivalent)
Hard Skills – Nice to have
Automation experience with CI/CD pipelines to support deployment and integration workflows, including trunk‑based development using services such as Azure DevOps, Jenkins, Octopus
Advanced proficiency in PySpark for complex data processing tasks
Advanced proficiency in Spark workflow optimization and orchestration using tools such as Asset Bundles or DAG orchestration
Certifications: Databricks Certified Data Engineer / Databricks Certified Professional Architect, cloud architect/data certifications (AWS/Azure/GCP)
Soft Skills / Business Specific Skills
Ability to identify, troubleshoot, and resolve complex data issues effectively
Strong teamwork and communication skills, plus the intellectual curiosity to work collaboratively and effectively with cross‑functional teams
Commitment to delivering high‑quality, accurate, and reliable data product solutions
Willingness to embrace new tools, technologies, and methodologies
Innovative thinker with a proactive approach to overcoming challenges
At Allata, we value differences.
Allata is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.
Allata makes employment decisions without regard to race, color, creed, religion, age, ancestry, national origin, veteran status, sex, sexual orientation, gender, gender identity, gender expression, marital status, disability or any other legally protected category. This policy applies to all terms and conditions of employment, including but not limited to, recruiting, hiring, placement, promotion, termination, layoff, recall, transfer, leaves of absence, compensation, and training.
We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are made by humans. If you would like more information about how your data is processed, please contact us.