Kanak Elite Services

Data Engineer

Kanak Elite Services, McLean, Virginia, US 22107


Overview

Data Engineer role at Kanak Elite Services (McLean, VA; hybrid). Contract position.

Tech Stack

Python, PySpark, Node.js, Vue.js, AWS EMR/Glue, Lambda, S3, SNS, DynamoDB, Databricks

Key Responsibilities

- Design, build, test, deploy, and maintain end-to-end data pipelines (batch, micro-batch, and streaming when needed) using PySpark, Python, and other tools.
- Use AWS services (EMR, Glue, Lambda, S3, SNS, DynamoDB, etc.) to orchestrate and manage data workflows.
- Integrate with Databricks environments as needed (e.g., for Spark workloads, notebooks, jobs).
- Ensure data ingestion from diverse sources (structured, semi-structured, unstructured) into data lake / warehouse layers.
- Optimize pipeline performance, reliability, scalability, and cost (e.g., tuning Spark jobs, partitioning strategies, caching, resource sizing).
- Implement data transformations, aggregations, and joins, ensuring correctness of logic and performance.
- Build monitoring, alerting, and logging for data pipelines and data platform components (e.g., CloudWatch, CloudTrail, custom dashboards).
- Ensure data quality, lineage, and accountability (validation, schema enforcement, error handling, retries).
- Collaborate with data scientists, BI/analytics teams, and application developers to understand data requirements and deliver usable datasets.
- Maintain and evolve data models, schemas, and metadata (e.g., catalog, data dictionary).
- Assist in architecture and design reviews; propose improvements and modernization (e.g., adopting new patterns, services, and best practices).
- Participate in code reviews, documentation, and team knowledge sharing.
- Help migrate and refactor legacy ETL systems into cloud-native architectures when needed.
- Support deployment and CI/CD of data assets (infrastructure as code, versioning, automation).
- Troubleshoot production issues and performance bottlenecks, ensuring high availability and SLAs.

Required Qualifications / Skills

- 3–7+ years of experience in a data engineering or software engineering role (or equivalent) with a strong focus on big data, ETL/ELT, and cloud infrastructure.
- Expertise in Python and PySpark / Apache Spark for data processing.
- Hands-on experience with AWS data and compute services: EMR, Glue, Lambda, S3, SNS, DynamoDB (and optionally others such as Athena, Kinesis, Step Functions).
- Experience integrating or working with Databricks (or a similar managed Spark platform).
- Strong SQL skills and experience working with relational and NoSQL data systems.
- Experience designing efficient data partitioning, bucketing, join strategies, caching, and data-format optimization (e.g., Parquet, ORC, Delta).
- Ability to build and maintain data models and schemas, with an understanding of star/snowflake schemas, dimensional modeling, and normalization/denormalization tradeoffs.
- Familiarity with orchestration and workflow tools (e.g., Airflow, AWS Step Functions, Glue workflows).
- Experience with CI/CD, version control (Git), and infrastructure as code (Terraform, CloudFormation).
- Good understanding of data governance, security, permissions, encryption, IAM, data lineage, and auditing.
- Strong debugging, problem-solving, and performance-tuning skills in distributed systems.
- Excellent communication skills and the ability to interact with stakeholders and cross-functional teams.
- Ability to work independently in a hybrid setup and manage deliverables under tight deadlines.

Nice-to-have / Bonus Skills

- Experience with streaming platforms (Kafka, Kinesis, Pulsar, etc.).
- Familiarity with additional AWS services: Kinesis, EMR autoscaling, AWS Glue Catalog, Lake Formation, Athena.
- Experience with Delta Lake, Iceberg, or similar lakehouse architectures.
- Prior experience migrating from on-prem or legacy ETL to cloud-based architectures.
- Experience with containerization (Docker) or orchestration (Kubernetes).
- Experience with data science / ML infrastructure (feature stores, model inference pipelines).
- Experience with additional languages/frameworks (Node.js, Vue.js) in the context of data services or UI dashboards.
- Familiarity with DevOps practices: observability, logging, metrics, and alerting frameworks.
- Prior government, defense, or federal contracting experience (security clearances, compliance).

Application Details

Note: Referrals increase your chances of interviewing.
