Capgemini
Location:
Atlanta, GA, Chicago, IL, New York, NY
Job Description
We are seeking a highly skilled Big Data Integration Engineer with expertise in Informatica Big Data Management (BDM) and Microsoft HDInsight (Hadoop) to design, develop, and optimize large-scale data integration and processing solutions. The ideal candidate will work on building robust ETL workflows, managing distributed data platforms, and ensuring data quality and governance across big data ecosystems.
Key Responsibilities
Design and implement ETL workflows using Informatica BDM for big data environments.
Develop and maintain Hadoop clusters and data pipelines on Azure HDInsight.
Perform data integration, migration, and transformation across Hadoop and cloud platforms.
Optimize workflows for performance and scalability using Blaze engine, YARN, and Spark.
Collaborate with cross‑functional teams to define data requirements and architecture.
Troubleshoot and resolve data‑related issues while ensuring compliance and security standards are met.
Automate processes using Shell scripting, Python, or similar languages.
Work with Hadoop ecosystem tools (HDFS, Hive, Pig, MapReduce) and Informatica components.
Maintain documentation and adhere to data governance best practices.
Required Skills & Qualifications
6+ years of experience with Informatica BDM and PowerCenter.
Strong knowledge of ETL, data modeling, SQL/Hive, and big data concepts.
Hands‑on experience with the Hadoop ecosystem (YARN, Ambari, ZooKeeper).
Proficiency in Java/C#/Python and scripting for automation.
Familiarity with Azure HDInsight, Azure Storage, and cloud‑based big data solutions.
Good understanding of distributed systems, data security, and compliance.
Preferred Qualifications
Experience with Spark, NoSQL databases, and workflow orchestration tools (Oozie).
Exposure to containerization and orchestration (Docker, Kubernetes).
Contributions to open‑source big data technologies.
Soft Skills
Strong analytical and problem‑solving skills.
Excellent communication and collaboration abilities.
Ability to work in a fast‑paced, agile environment.
Seniority Level: Mid‑Senior level
Employment Type: Full‑time
Job Function: Information Technology
Industries: IT Services and IT Consulting