ApTask

Apache Iceberg Data Lead

ApTask, Jersey City, New Jersey, United States, 07390

About Client: The client provides information technology (IT) services, including business outsourcing, infrastructure technology, and application services. The application services offered by the company include application development, maintenance, and support. The markets served by the company are financial services and insurance, healthcare, manufacturing, government, transportation, communications, and consumer and retail industries.

Rate Range: $60-$65/Hr

Job Description: We are seeking a highly skilled and experienced Apache Iceberg Data Lead to design, implement, and manage our data lake infrastructure. You will be responsible for building a scalable and efficient data lake using Apache Iceberg, ensuring data reliability, performance, and accessibility for downstream analytics and reporting. You will work closely with our Flink stream application developers and data scientists to build a robust data platform.

Responsibilities:

Data Lake Architecture and Design: Design and implement a scalable and robust data lake architecture using Apache Iceberg. Define data lake best practices, including data partitioning, clustering, and versioning. Develop and maintain data lake schemas and metadata. Integrate Apache Iceberg with other data lake components (e.g., storage systems, compute engines).

Iceberg Implementation and Management: Implement and manage Apache Iceberg tables for both raw source data and processed Flink output. Optimize Iceberg performance for various query patterns. Ensure data quality and consistency within the data lake. Manage Iceberg table evolution and schema changes. Implement data retention and archival policies.

Integration with Flink and Other Data Systems: Design and implement seamless integration between Apache Flink and Apache Iceberg for data ingestion and storage. Work with Flink developers to ensure efficient data writing to Iceberg tables. Integrate Iceberg with other data processing and analytics tools (e.g., Spark, Presto, Trino). Work with message queues like Kafka to ingest data into Iceberg.

Performance and Optimization: Monitor and optimize data lake performance. Troubleshoot and resolve data lake performance and stability issues. Conduct performance testing and benchmarking.

Data Governance and Security: Implement data governance policies within the data lake. Ensure data security and access control. Implement data lineage and audit trails.

Technical Leadership: Provide technical leadership and guidance on Apache Iceberg and data lake best practices. Mentor junior engineers and contribute to knowledge sharing. Stay up-to-date with the latest developments in Apache Iceberg and data lake technologies.

Qualifications:

Required: Bachelor's or Master's degree in Computer Science, Engineering, or a related field. 7+ years of experience in data engineering or data warehousing. 3+ years of hands-on experience with Apache Iceberg. Strong understanding of data lake architectures and best practices. Proficiency in SQL and experience with data processing frameworks (e.g., Spark, Flink). Experience with cloud storage systems (e.g., AWS S3, Azure Blob Storage, Google Cloud Storage). Experience with message queues like Kafka. Strong problem-solving and analytical skills. Excellent communication and collaboration skills.

Preferred: Experience with other data lake technologies (e.g., Apache Hudi, Delta Lake). Experience with metadata management tools. Experience with data governance and security tools. Experience with containerization and orchestration technologies (e.g., Docker, Kubernetes). Contributions to open source projects.

About ApTask: ApTask is a leading global provider of workforce solutions and talent acquisition services, dedicated to shaping the future of work. As an African American-owned and Veteran-certified company, ApTask offers a comprehensive suite of services, including staffing and recruitment solutions, managed services, IT consulting, and project management. With a focus on excellence, collaboration, and innovation, ApTask provides unparalleled opportunities for professional growth and development. As a member of the ApTask team, you will have the chance to connect businesses with top-tier professionals, optimize workforce performance, and drive success across diverse industries. Join us at ApTask and be part of our mission to empower organizations to thrive while fostering a diverse and inclusive work environment.

Applicants may be required to attend interviews in person or by video conference. In addition, candidates may be required to present their current state or government issued ID during each interview.

Candidate Data Collection Disclaimer: At ApTask, we prioritize safeguarding your privacy. As part of our recruitment process, certain Personally Identifiable Information (PII) may be requested by our clients for verification and application purposes. Rest assured, we strictly adhere to confidentiality standards and comply with all relevant data protection laws. Please note that we only collect the necessary information as specified by each client and do not request sensitive details during the initial stages of recruitment.

If you have any concerns or queries about your personal information, please feel free to contact our compliance team at businessexcellence@aptask.com.

Applicant Consent: By submitting your application, you agree to ApTask's (www.aptask.com) Terms of Use and Privacy Policy, and provide your consent to receive SMS and voice call communications regarding employment opportunities that match your resume and qualifications. You understand that your personal information will be used solely for recruitment purposes and that you can withdraw your consent at any time by contacting us at 732-355-8000 or help@aptask.com. Message frequency may vary. Msg & data rates may apply.