TCS USA / Avance Consulting

Hadoop PySpark Developer

TCS USA / Avance Consulting, Strongsville

Skill: Hadoop PySpark Developer 

Must Have Technical/Functional Skills:

  • Cloudera Data Platform, PySpark, Python, Hive, MapReduce, Linux/Unix, Impala, Big Data technologies, Cloud technologies.

Roles & Responsibilities:

  • Understand requirements/use cases and build efficient ETL solutions using Apache Spark, Python, Kafka, and Hive, targeting Cloudera Data Platform.

  • Analyze requirements/use cases, convert functional requirements into concrete technical tasks, and provide reasonable effort estimates.

  • Work closely with data analysts/modelers and business users to understand data requirements. Convert requirements into high-level and low-level designs and source-to-target documents.

  • Design, develop, and schedule data pipelines that handle large volumes of data within SLA. Work with solution architects, technical managers, and admins to understand SLAs and system limitations, and provide efficient solutions.

  • Expertise in processing and aggregating large volumes of data using Spark; must know different performance improvement techniques and should lead teams on optimization.

  • Develop efficient data ingestion and data governance frameworks as per specification.

  • Improve the performance of existing Spark-based data ingestion and aggregation pipelines to meet SLAs.

  • Work proactively and independently with global teams to address project requirements, and articulate issues/challenges with enough lead time to address project delivery risks.

  • Plan production implementation activities, execute change requests, and resolve issues in production implementation.

  • Plan and execute large data migrations and historical data rebuild activities.

  • Perform code reviews/optimization and test case reviews. Demonstrate troubleshooting skills in resolving technical issues and bugs.

  • Demonstrate ownership and initiative. Ability to bring in best practices/solutions that best fit the client's problem and environment.