Cliff Services Inc
W2 only--Data Engineer--F2F Interview (No C2C)
Cliff Services Inc, Richmond, Virginia, United States, 23214
Job Title
Data Engineers

Type
Onsite (Hybrid, 3 to 4 days in office)

Interview
In Person

Locations
McLean VA, Richmond VA, Dallas TX

Job Description
A Data Engineer with Python, PySpark, and AWS expertise is responsible for designing, building, and maintaining scalable and efficient data pipelines in a cloud environment.

Responsibilities
Design, develop, and maintain robust ETL/ELT pipelines using Python and PySpark for data ingestion, transformation, and processing.
Work extensively with AWS cloud services such as S3, Glue, EMR, Lambda, Redshift, Athena, and DynamoDB for data storage, processing, and warehousing.
Build and optimize data ingestion and processing frameworks for large-scale data sets, ensuring data quality, consistency, and accuracy.
Collaborate with data architects, data scientists, and business intelligence teams to understand data requirements and deliver effective data solutions.
Implement data governance, lineage, and security best practices within data pipelines and infrastructure.
Automate data workflows and improve data pipeline performance through optimization and tuning.
Develop and maintain documentation for data solutions, including data dictionaries, lineage, and technical specifications.
Participate in code reviews, contribute to continuous improvement initiatives, and troubleshoot complex data and pipeline issues.

Required Skills
Strong programming proficiency in Python, including libraries like Pandas, and extensive experience with PySpark for distributed data processing.
Solid understanding of and practical experience with Apache Spark/PySpark for large-scale data transformations.
Demonstrated experience with AWS data services, including S3, Glue, EMR, Lambda, Redshift, and Athena.
Proficiency in SQL and a strong understanding of data modeling, schema design, and data warehousing concepts.
Experience with workflow orchestration tools such as Apache Airflow or AWS Step Functions.
Familiarity with CI/CD pipelines and version control systems (e.g., Git).
Excellent problem-solving, analytical, and communication skills, with the ability to work effectively in a team environment.

Preferred Skills
Experience with streaming frameworks like Kafka or Kinesis.
Knowledge of other data warehousing solutions like Snowflake.

Seniority level
Mid-Senior level

Employment type
Contract

Job function
Analyst

Industries
Banking

Contact
K Hemanth | Recruitment Specialist
Thanks & regards