Dexian

Job Title: Data Engineer, AI/ML Pipelines
Work Model: Hybrid (on-site 3 days a week)
Location: Seffner, FL

Position Summary

The Data Engineer, AI/ML Pipelines plays a key role in building and optimizing the data infrastructure that powers enterprise analytics and machine learning initiatives. This position focuses on developing robust, scalable, and intelligent data pipelines, from ingestion through feature engineering to model deployment and monitoring. The ideal candidate has hands-on experience supporting end-to-end ML workflows, integrating operational data from Warehouse Management Systems (WMS) and ERP platforms, and enabling real-time predictive systems. This is a highly collaborative role, working across Data Science, ML Engineering, and Operations to ensure that models are fed with clean, reliable, and production-ready data.

Key Responsibilities

ML-Focused Data Engineering
- Build and maintain data pipelines optimized for machine learning workflows and real-time model deployment.
- Partner with data scientists to prepare, version, and monitor feature sets for retraining and evaluation.
- Design and implement feature stores, data validation layers, and model input pipelines that ensure scalability and reproducibility.

Data Integration from WMS & Operational Systems
- Ingest, normalize, and enrich data from WMS, ERP, and telemetry platforms.
- Model operational data to support predictive analytics and AI-driven warehouse automation use cases.
- Develop integrations that provide high-quality, structured data to data science and business teams.

Pipeline Automation & Orchestration
- Design, orchestrate, and automate modular pipelines using tools such as Azure Data Factory, Airflow, or Databricks Workflows.
- Ensure pipeline reliability, scalability, and monitoring for both batch and streaming use cases.
- Implement CI/CD practices for data pipelines supporting ML deployment.

Data Governance & Quality
- Establish robust data quality frameworks, anomaly detection, and reconciliation checks.
- Maintain strong data lineage, versioning, and metadata management to ensure reproducibility and compliance.
- Contribute to the organization's broader data governance and MLOps standards.

Cross-Functional Collaboration
- Collaborate closely with Data Scientists, ML Engineers, Software Engineers, and Operations teams to translate modeling requirements into technical solutions.
- Serve as the technical liaison between data engineering and business users for ML-related data needs.

Documentation & Mentorship
- Document data flows, feature transformations, and ML pipeline logic in a reproducible, team-friendly format.
- Mentor junior data engineers and analysts on ML data architecture and best practices.

Required Qualifications

Technical Skills
- Proven experience designing and maintaining ML-focused data pipelines and supporting model lifecycle workflows.
- Proficient in Python, SQL, and data transformation tools such as dbt, Spark, or Delta Lake.
- Strong understanding of cloud-based data platforms (Azure, Databricks) and data orchestration frameworks.
- Familiarity with ML pipeline tools such as MLflow, TFX, or Kubeflow.
- Hands-on experience working with Warehouse Management Systems (WMS) or other operational logistics data.

Experience
- 5+ years in data engineering, with at least 2 years supporting AI/ML systems.
- Proven track record building and maintaining production-grade pipelines in cloud environments.
- Demonstrated collaboration with data scientists and experience turning analytical models into operational data products.

Education
- Bachelor's degree in Computer Science, Data Science, Engineering, or a related field (Master's preferred).
- Relevant certifications are a plus (e.g., Azure AI Engineer, Databricks ML Associate, Google Professional Data Engineer).

Preferred Qualifications
- Experience with real-time data ingestion technologies (Kafka, Kinesis, Event Hubs).
- Exposure to MLOps best practices and CI/CD for ML and data pipelines.
- Industry experience in logistics, warehouse automation, or supply chain analytics.