Anblicks
Job Title: Data Engineer (Python, PySpark, Databricks)

We are seeking a Data Engineer with strong proficiency in SQL, Python, and PySpark to support high-performance data pipelines and analytics initiatives. This role will focus on scalable data processing, transformation, and integration efforts that enable business insights, regulatory compliance, and operational efficiency.
Key Responsibilities
- Design, develop, and optimize ETL/ELT pipelines using SQL, Python, and PySpark for large-scale data environments (a minimal PySpark sketch follows this list)
- Implement scalable data processing workflows in distributed data platforms (e.g., Hadoop, Databricks, or Spark environments)
- Partner with business stakeholders to understand and model mortgage lifecycle data (origination, underwriting, servicing, foreclosure, etc.)
- Create and maintain data marts, views, and reusable data components to support downstream reporting and analytics
- Ensure data quality, consistency, security, and lineage across all stages of data processing
- Assist in data migration and modernization efforts to cloud-based data warehouses (e.g., Snowflake, Azure Synapse, GCP BigQuery)
- Document data flows, logic, and transformation rules
- Troubleshoot performance and quality issues in batch and real-time pipelines
- Support compliance-related reporting (e.g., HMDA, CFPB)
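To give a concrete flavor of the pipeline work above, here is a minimal PySpark ETL sketch. Everything in it is hypothetical: the S3 paths, the loan_events dataset, and the loan_id/event_ts columns are illustrative placeholders, not details from this posting.

```python
# Minimal PySpark ETL sketch: extract, transform, load.
# Paths, dataset, and column names below are hypothetical.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.window import Window

spark = SparkSession.builder.appName("loan-etl-sketch").getOrCreate()

# Extract: read raw loan events (hypothetical source path).
raw = spark.read.parquet("s3://example-bucket/raw/loan_events/")

# Transform: standardize the event timestamp, then keep only the
# latest event per loan using a window keyed on the loan identifier.
latest = Window.partitionBy("loan_id").orderBy(F.col("event_ts").desc())
curated = (
    raw.withColumn("event_ts", F.to_timestamp("event_ts"))
       .withColumn("rn", F.row_number().over(latest))
       .filter(F.col("rn") == 1)
       .drop("rn")
)

# Load: write to a curated zone, partitioned by event date so
# downstream readers can prune partitions.
(curated
    .withColumn("event_date", F.to_date("event_ts"))
    .write.mode("overwrite")
    .partitionBy("event_date")
    .parquet("s3://example-bucket/curated/loan_events/"))
```

Window-based deduplication keyed on the business identifier is a common pattern for keeping only the most recent record per entity in large event feeds.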
Required Qualifications
- 6+ years of experience in data engineering or data development
- Advanced expertise in SQL (joins, CTEs, optimization, partitioning, etc.); see the Spark SQL example after this list
- Strong hands-on skills in Python for scripting, data wrangling, and automation
- Proficiency in PySpark for building distributed data pipelines and processing large volumes of structured/unstructured data
- Experience with mortgage banking data sets and mortgage domain knowledge is highly preferred
- Strong understanding of data modeling (dimensional, normalized, star schema)
- Experience with cloud-based platforms (e.g., Azure Databricks, AWS EMR, GCP Dataproc)
- Familiarity with ETL tools and orchestration frameworks (e.g., Airflow, ADF, dbt); see the Airflow sketch after this list
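As a sketch of the SQL patterns the qualifications name (joins, CTEs, window functions), here is a Spark SQL example. The loans and balances tables and their columns are hypothetical, assumed to already be registered in the metastore.

```python
# Spark SQL sketch: CTE + join + window function to get the latest
# balance per loan. Table and column names are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("sql-sketch").getOrCreate()

latest_balances = spark.sql("""
    WITH ranked AS (
        SELECT
            l.loan_id,
            l.borrower_id,
            b.balance,
            ROW_NUMBER() OVER (
                PARTITION BY l.loan_id
                ORDER BY b.as_of_date DESC
            ) AS rn
        FROM loans l
        JOIN balances b
          ON b.loan_id = l.loan_id
    )
    SELECT loan_id, borrower_id, balance
    FROM ranked
    WHERE rn = 1
""")
latest_balances.show()
```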
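And a minimal orchestration sketch, assuming Apache Airflow 2.4+ with the TaskFlow API. The DAG id, schedule, and task bodies are placeholders, not a prescribed design.

```python
# Minimal Airflow DAG sketch wiring extract -> transform -> load.
# dag_id, schedule, and task contents are hypothetical.
from datetime import datetime

from airflow.decorators import dag, task

@dag(
    dag_id="mortgage_daily_pipeline",  # hypothetical name
    schedule="@daily",
    start_date=datetime(2024, 1, 1),
    catchup=False,
)
def mortgage_daily_pipeline():
    @task
    def extract() -> str:
        # Placeholder: pull raw loan records from a source system.
        return "s3://example-bucket/raw/loan_events/"

    @task
    def transform(raw_path: str) -> str:
        # Placeholder: submit the PySpark job sketched earlier.
        return "s3://example-bucket/curated/loan_events/"

    @task
    def load(curated_path: str) -> None:
        # Placeholder: publish curated tables to the warehouse.
        pass

    load(transform(extract()))

mortgage_daily_pipeline()
```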
Seniority Level: Mid-Senior level
Employment Type: Full-time
Job Function: Information Technology, Design, and Education
Industries: IT Services and IT Consulting, Design Services, and Engineering Services
Location & Compensation: Dallas, TX – $109,000.00–$155,500.00
Apply: Direct message the job poster from Anblicks.