Droisys

Data Engineer with Pytest and PySpark, Databricks experience

Droisys, Plano, Texas, US, 75086

Droisys is an innovation technology company focused on helping companies accelerate their digital initiatives from strategy and planning through execution. We leverage deep technical expertise, Agile methodologies, and data-driven intelligence to modernize systems of engagement and simplify human/tech interaction.

Amazing things happen when we work in environments where everyone feels a true sense of belonging and when candidates have the requisite skills and opportunities to succeed. At Droisys, we invest in our talent and support career growth, and we are always on the lookout for amazing talent who can contribute to our growth by delivering top results for our clients. Join us to challenge yourself and accomplish work that matters.

Interview Mode: Video + F2F

Rate Range: $40 to $46/hr on W2, all-inclusive

Job Description

We are seeking an experienced Data Engineer with a strong background in PySpark, Databricks, AI/ML integration, and automated testing using Pytest. The ideal candidate will design, develop, and optimize scalable data pipelines, ensuring reliability and performance across analytical and machine learning workloads. This role requires hands‑on technical expertise, strong analytical thinking, and a collaborative mindset.

Responsibilities

Design, develop, and maintain scalable data pipelines using PySpark and Databricks.

Integrate AI/ML models into existing data workflows for predictive analytics and intelligent automation.

Implement robust testing frameworks using Pytest to validate data quality, transformations, and pipeline integrity.

Optimize data storage and processing performance across distributed systems.

Collaborate with data scientists, AI engineers, and business analysts to enable machine learning workflows.

Develop CI/CD processes for data pipelines and model deployment within Databricks.

Troubleshoot and resolve performance issues across large-scale datasets.

Document and maintain best practices for coding, testing, and deployment.

Required Skills

Strong experience with PySpark for large-scale data transformation and ETL.

Hands‑on expertise in the Databricks environment (cluster management, job orchestration, and notebooks).

Proficient in Pytest for unit testing and data validation automation.

Experience with AI/ML pipelines (model training, evaluation, and integration with data engineering workflows) is nice to have.

Strong programming skills in Python and SQL.

Knowledge of Delta Lake, Azure Data Lake, or AWS S3 environments.

Familiarity with CI/CD tools (Git, Jenkins, Azure DevOps).

Excellent analytical and problem‑solving skills.

Preferred Qualifications

Experience with MLflow or MLOps frameworks.

Exposure to cloud-based data ecosystems (Azure, AWS, or GCP).

Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field.

Droisys is an equal opportunity employer. We do not discriminate based on race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status, or any other characteristic protected by law. Droisys believes in diversity, inclusion, and belonging, and we are committed to fostering a diverse work environment.

Seniority level

Mid‑Senior level

Employment type

Contract

Job function

Information Technology

Industries

IT Services and IT Consulting
