HEITMEYER CONSULTING INC
Heitmeyer Consulting has a banking client seeking, within its Chief Data Office, a highly skilled Data Platform Engineer to lead the design, build, and operation of the client's real-time event-streaming platform. This role requires deep expertise in Apache Kafka and the AWS ecosystem, with particular emphasis on native AWS services, S3 for storage, and Apache Iceberg as the open table format. The ideal candidate will establish the foundational framework, implement data governance policies, and define schema standards to ensure a reliable, scalable, and secure data ecosystem for analytics and machine learning workloads. The role must be based in Dallas, TX or Tulsa, OK.
Top Required Skills:
Extensive experience as a Data Platform Engineer or in a similar role, with hands-on experience in production environments.
In-depth knowledge of the Apache Kafka ecosystem, including brokers, topics, partitions, replicas, Kafka Connect, and Schema Registry concepts (a minimal producer sketch follows this list).
Significant cloud background with strong expertise in AWS cloud services related to data engineering (S3, MSK, Glue, Athena, EMR, DynamoDB, etc.).
Proficiency in programming/scripting languages such as Python, Java, or Scala.
Strong background in data lake technologies, including S3 as the data lake storage layer, with hands-on experience using Apache Iceberg and its features, such as schema evolution and operational efficiencies for large datasets.
Familiarity with DevOps practices, CI/CD pipelines, and Infrastructure‑as‑Code (IaC) tools (e.g., Terraform, CloudFormation, GitLab CI, Argo CD).
Excellent problem‑solving skills and the ability to troubleshoot complex issues in distributed systems.
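As a rough illustration of the hands-on Kafka familiarity listed above, here is a minimal producer sketch in Python using the confluent-kafka client. The broker address, topic name, and payload are illustrative assumptions, not details from this posting.

```python
# Minimal Kafka producer sketch (confluent-kafka client).
# Broker address, topic name, and payload are illustrative placeholders.
import json

from confluent_kafka import Producer

producer = Producer({
    "bootstrap.servers": "localhost:9092",  # assumed broker; an MSK cluster
                                            # supplies its own bootstrap endpoints
    "acks": "all",                          # wait for the full in-sync replica set
})

def on_delivery(err, msg):
    """Per-message delivery report from the broker."""
    if err is not None:
        print(f"delivery failed: {err}")
    else:
        print(f"delivered to {msg.topic()}[{msg.partition()}] @ offset {msg.offset()}")

event = {"account_id": "42", "event_type": "deposit", "amount": 100.0}
producer.produce(
    "transactions",              # hypothetical topic name
    key=event["account_id"],     # keying by account keeps each account's events
                                 # ordered within a single partition
    value=json.dumps(event),
    callback=on_delivery,
)
producer.flush()                 # block until all queued messages are delivered
```

The key choice matters: Kafka guarantees ordering only within a partition, so keying by a business identifier is the usual way to preserve per-entity ordering.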
Nice‑to‑have:
Background in financial services.
Key Responsibilities
Design and implement robust, scalable, and fault‑tolerant event streaming architectures and end‑to‑end data pipelines using Apache Kafka and AWS services (e.g., MSK, S3, Glue, Athena, EMR).
Build and maintain the data platform infrastructure using Infrastructure‑as‑Code (IaC) tools (e.g., Terraform, CloudFormation), focusing on native S3 storage and the Apache Iceberg table format for efficient data lake management (see the Athena/Iceberg sketch after this list).
Establish data governance and schema definitions:
Establish best practices for schema design, topic governance, data contracts, and message lifecycle management.
Implement and manage a Schema Registry (e.g., Confluent Schema Registry) to enforce message structure, validate schemas, and manage schema evolution without breaking downstream applications (see the Schema Registry sketch after this list).
Define retention policies and naming standards, and ensure adherence to compliance, auditing, and data protection requirements (encryption at rest and in transit).
Ensure the performance, reliability, scalability, and high availability of the Kafka platform and data pipelines, including capacity planning, performance tuning, and optimization.
Implement comprehensive monitoring, logging, and alerting using enterprise observability tools (e.g., Prometheus, Grafana, CloudWatch); an example CloudWatch alarm follows this list.
Partner with data architects, software engineers, and data scientists to drive the adoption of event‑driven architectures and provide platform guidance and troubleshooting expertise to application teams.
Develop and maintain CI/CD pipelines for automated deployment of data platform resources and configurations.
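To make the Iceberg-on-S3 work above concrete, below is a minimal sketch of querying an Iceberg table through Athena with boto3. The region, database, table, and S3 output location are assumptions for illustration, not details from this posting.

```python
# Sketch: run an Athena SQL query against an Iceberg table cataloged in Glue.
# Region, database, table, and output bucket are hypothetical placeholders.
import boto3

athena = boto3.client("athena", region_name="us-east-1")

response = athena.start_query_execution(
    # Iceberg tables in Athena are queried with ordinary SQL; snapshot
    # time travel uses the FOR TIMESTAMP AS OF / FOR VERSION AS OF clauses.
    QueryString="SELECT event_type, COUNT(*) FROM events GROUP BY event_type",
    QueryExecutionContext={"Database": "analytics_lake"},
    ResultConfiguration={"OutputLocation": "s3://example-bucket/athena-results/"},
)
print("query execution id:", response["QueryExecutionId"])
```

Athena runs queries asynchronously; a real pipeline would poll get_query_execution until the state is SUCCEEDED before reading results.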
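Similarly, here is a minimal sketch of enforcing message structure with the Confluent Schema Registry Python client; the registry URL, Avro schema, and topic are illustrative assumptions.

```python
# Sketch: serialize a record against an Avro schema managed by Schema Registry.
# Registry URL, schema, and topic are hypothetical placeholders.
from confluent_kafka.schema_registry import SchemaRegistryClient
from confluent_kafka.schema_registry.avro import AvroSerializer
from confluent_kafka.serialization import MessageField, SerializationContext

SCHEMA_STR = """
{
  "type": "record",
  "name": "Transaction",
  "fields": [
    {"name": "account_id", "type": "string"},
    {"name": "amount", "type": "double"}
  ]
}
"""

registry = SchemaRegistryClient({"url": "http://localhost:8081"})  # assumed URL
serializer = AvroSerializer(registry, SCHEMA_STR)

# By default the schema is registered under the subject "transactions-value"
# (TopicNameStrategy), and records that do not match the schema are rejected,
# which is what protects downstream consumers during schema evolution.
payload = serializer(
    {"account_id": "42", "amount": 100.0},
    SerializationContext("transactions", MessageField.VALUE),
)
print(f"serialized {len(payload)} bytes")
```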
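Finally, as one example of the alerting responsibility, here is a boto3 sketch that creates a CloudWatch alarm on MSK consumer lag; the cluster name, consumer group, topic, and threshold are assumptions for illustration.

```python
# Sketch: alarm when an MSK consumer group's offset lag stays high.
# Cluster, consumer group, topic, and threshold are hypothetical placeholders.
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

cloudwatch.put_metric_alarm(
    AlarmName="msk-consumer-lag-high",
    Namespace="AWS/Kafka",             # MSK publishes its metrics here
    MetricName="MaxOffsetLag",         # per consumer-group/topic lag metric
    Dimensions=[
        {"Name": "Cluster Name", "Value": "example-cluster"},
        {"Name": "Consumer Group", "Value": "example-consumers"},
        {"Name": "Topic", "Value": "transactions"},
    ],
    Statistic="Maximum",
    Period=300,                        # evaluate over 5-minute windows...
    EvaluationPeriods=3,               # ...for three consecutive windows
    Threshold=10000,
    ComparisonOperator="GreaterThanThreshold",
    TreatMissingData="notBreaching",
)
```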
Heitmeyer Consulting is an equal opportunity employer, and we encourage all qualified candidates to apply. Qualified applicants will be considered without regard to minority status, gender, disability, veteran status or any other characteristic protected by law.