Anblicks
Key Responsibilities:
Lead the architecture, design, and implementation of scalable data pipelines using PySpark and Snowflake.
Migrate and modernize legacy ETL/ELT processes to cloud-native solutions.
Design Snowflake data models optimized for performance, scalability, and cost-efficiency.
Develop and maintain PySpark-based data processing frameworks for large-scale batch and streaming data (see the PySpark sketch after this list).
Implement best practices for data ingestion, transformation, and orchestration using tools such as Airflow, DBT, or Matillion (see the Airflow sketch after this list).
Collaborate with data architects, analysts, and business stakeholders to understand data requirements and deliver solutions.
Ensure data quality, lineage, and governance through robust validation and monitoring frameworks.
Optimize Snowflake performance through clustering, caching, and workload management.
Enforce data security and compliance using RBAC, masking, and encryption techniques (see the Snowflake sketch after this list).
Lead code reviews, mentor junior engineers, and contribute to a culture of engineering excellence.
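For a flavor of the PySpark pipeline work described above, here is a minimal batch sketch. The S3 paths and columns (order_id, order_ts, amount) are hypothetical placeholders for illustration, not a description of any Anblicks system.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders_batch").getOrCreate()

# Ingest: raw JSON landed in object storage (path is hypothetical).
raw = spark.read.json("s3://example-bucket/raw/orders/")

# Transform: drop bad records, normalize types, derive a partition column.
clean = (
    raw.filter(F.col("order_id").isNotNull())
       .withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
       .withColumn("order_date", F.to_date("order_ts"))
)

# Load: partitioned Parquet, staged for downstream ingestion into Snowflake.
(clean.write
      .mode("overwrite")
      .partitionBy("order_date")
      .parquet("s3://example-bucket/curated/orders/"))
```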
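Likewise, a minimal Apache Airflow DAG illustrating the orchestration pattern; the DAG id, schedule, and task bodies are assumptions made for the example.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

# Placeholder callables; real tasks would launch Spark jobs or Snowflake loads.
def ingest():
    print("ingest raw data")

def transform():
    print("run transformations")

with DAG(
    dag_id="orders_daily",            # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    ingest_task = PythonOperator(task_id="ingest", python_callable=ingest)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)

    ingest_task >> transform_task     # transform runs only after ingest succeeds
```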
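Finally, a sketch of the Snowflake clustering and dynamic-masking features mentioned above, issued through the snowflake-connector-python driver. All object names, roles, and credentials are hypothetical; in practice credentials would come from a secrets manager.

```python
import snowflake.connector

# Hypothetical connection details for illustration only.
conn = snowflake.connector.connect(
    account="example_account",
    user="example_user",
    password="example_password",
    warehouse="TRANSFORM_WH",
    database="ANALYTICS",
    schema="PUBLIC",
)
cur = conn.cursor()
try:
    # Performance: define a clustering key on commonly filtered columns.
    cur.execute("ALTER TABLE orders CLUSTER BY (order_date, customer_id)")

    # Security: dynamic data masking, visible in full only to a privileged role.
    cur.execute("""
        CREATE MASKING POLICY IF NOT EXISTS email_mask AS (val STRING)
        RETURNS STRING ->
            CASE WHEN CURRENT_ROLE() IN ('PII_READER') THEN val
                 ELSE '***MASKED***' END
    """)
    cur.execute(
        "ALTER TABLE customers MODIFY COLUMN email SET MASKING POLICY email_mask"
    )
finally:
    cur.close()
    conn.close()
```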
Required Skills & Experience:
10+ years of experience in data engineering, with at least 3 years in a lead role.
Strong hands-on experience with Snowflake, including architecture, data modeling, performance tuning, and advanced features.
Proficiency in PySpark for building distributed data processing pipelines.
Expertise in SQL and handling semi-structured data formats (a short PySpark sketch for flattening nested JSON follows this list).
Experience with cloud platforms (AWS, Azure, or GCP) and services such as S3, Lambda, and Step Functions.
Familiarity with CI/CD, Git-based workflows, and DevOps practices.
Experience with data orchestration tools such as Apache Airflow, DBT, or Matillion.
Strong understanding of data governance, security, and compliance frameworks.
Excellent communication and leadership skills.
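As an illustration of the semi-structured data handling called for above, a short PySpark sketch that flattens nested JSON. The path and fields (event_id, items, sku, qty) are assumed for the example.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("semi_structured_demo").getOrCreate()

# Hypothetical nested JSON: each event carries an array of line items.
events = spark.read.json("s3://example-bucket/raw/events/")

# Flatten one level of nesting: one output row per line item.
line_items = (
    events.select("event_id", F.explode("items").alias("item"))
          .select("event_id",
                  F.col("item.sku").alias("sku"),
                  F.col("item.qty").cast("int").alias("qty"))
)

line_items.show()
```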