Droisys
Data Engineer with Pytest and PySpark, Databricks experience
Droisys, Plano, Texas, US, 75086
Droisys is an innovation technology company focused on helping companies accelerate their digital initiatives from strategy and planning through execution. We leverage deep technical expertise, Agile methodologies, and data-driven intelligence to modernize systems of engagement and simplify human/tech interaction.
Amazing things happen when we work in environments where everyone feels a true sense of belonging and when candidates have the requisite skills and opportunities to succeed. At Droisys, we invest in our talent and support career growth, and we are always on the lookout for amazing talent who can contribute to our growth by delivering top results for our clients. Join us to challenge yourself and accomplish work that matters.
Interview Mode: Video + F2F
Rate Range: $40 to $46/hr, W2, all-inclusive
Job Description
We are seeking an experienced Data Engineer with a strong background in PySpark, Databricks, AI/ML integration, and automated testing using Pytest. The ideal candidate will design, develop, and optimize scalable data pipelines, ensuring reliability and performance across analytical and machine learning workloads. This role requires hands‑on technical expertise, strong analytical thinking, and a collaborative mindset.
Responsibilities
Design, develop, and maintain scalable data pipelines using PySpark and Databricks.
Integrate AI/ML models into existing data workflows for predictive analytics and intelligent automation.
Implement robust testing frameworks using Pytest to validate data quality, transformations, and pipeline integrity.
Optimize data storage and processing performance across distributed systems.
Collaborate with data scientists, AI engineers, and business analysts to enable machine learning workflows.
Develop CI/CD processes for data pipelines and model deployment within Databricks.
Troubleshoot and resolve performance issues across large-scale datasets.
Document and maintain best practices for coding, testing, and deployment.
Required Skills
Strong experience with PySpark for large-scale data transformation and ETL.
Hands‑on expertise in the Databricks environment (cluster management, job orchestration, and notebooks).
Proficient in Pytest for unit testing and data validation automation.
Experience with AI/ML pipelines (model training, evaluation, and integration with data engineering workflows) is nice to have.
Strong programming skills in Python and SQL.
Knowledge of Delta Lake, Azure Data Lake, or AWS S3 environments.
Familiarity with CI/CD tools (Git, Jenkins, Azure DevOps).
Excellent analytical and problem‑solving skills.
Preferred Qualifications
Experience with MLflow or MLOps frameworks.
Exposure to cloud-based data ecosystems (Azure, AWS, or GCP).
Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field.
Droisys is an equal opportunity employer. We do not discriminate based on race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status, or any other characteristic protected by law. Droisys believes in diversity, inclusion, and belonging, and we are committed to fostering a diverse work environment.
Seniority level: Mid‑Senior level
Employment type: Contract
Job function: Information Technology
Industries: IT Services and IT Consulting