iSoftTek Solutions Inc
AWS Data Engineer with SageMaker
iSoftTek Solutions Inc, Reston, Virginia, United States, 22090
Overview
Role: AWS Data Engineer with SageMaker
Location: Reston, VA (Day 1 onsite, 5 days a week)
We are seeking a motivated Senior Data Engineer with strong AWS and data platform experience to design, build, and operationalize scalable data processing and ML-ready pipelines. The ideal candidate will be hands-on with PySpark, Redshift, Glue, and automation using CI/CD and scripting.
Responsibilities
Design and implement scalable ETL/ELT pipelines on AWS for batch and near-real-time workloads.
Build and optimize data processing jobs using PySpark on EMR and Glue.
Develop and manage Redshift schemas, queries, and Spectrum for external table access.
Integrate machine learning workflows with SageMaker and Lambda-driven orchestration.
Automate deployments and testing using CI/CD tools and source control (Jenkins, UCD, Bitbucket, GitHub).
Create and maintain operational scripts and tooling (Shell, Python) for monitoring, troubleshooting, and performance tuning.
Must-have Skills
AWS services (EMR, SageMaker, Lambda, Redshift, Glue, SNS, SQS)
PySpark and data processing frameworks
Shell scripting and Python development
CI/CD tooling experience (Jenkins, UCD)
Source control experience with Bitbucket and GitHub
Experience building and maintaining scripts/tools for automation
Nice-to-have
Familiarity with AWS ECS
Experience with Aurora PostgreSQL
Java for tooling or pipeline components