Mastech Digital

Lead Data Engineer-Apache Iceberg

Mastech Digital, Strongsville, Ohio, United States, 44136


Lead the migration of datasets and ETL workflows from Cloudera Hadoop (Hive, Impala, HDFS, etc.) to an Apache Iceberg-based architecture.

Analyze existing data pipelines and storage formats (e.g., Parquet, ORC) to plan and execute a smooth migration strategy.

Design and implement scalable data ingestion and transformation pipelines using Apache Spark, Flink, or equivalent tools.

Optimize data partitioning, schema evolution, compaction, and metadata management using Iceberg best practices.

Integrate Iceberg tables with query engines like Trino or Presto to support data analytics use cases.

Ensure compatibility and data quality during the migration phase through robust testing, validation, and lineage tracking.

Establish monitoring, logging, and performance tuning for migrated pipelines and Iceberg tables.

Seniority level: Mid-Senior level

Employment type: Contract

Job function: Information Technology

Industries: IT Services and IT Consulting
