Anblicks
We are seeking a seasoned Data Engineer with strong proficiency in SQL, Python, and PySpark to support high-performance data pipelines and analytics initiatives in the Mortgage Banking domain. This role will focus on scalable data processing, transformation, and integration efforts that enable business insights, regulatory compliance, and operational efficiency.

Data Engineer - SQL, Python, and PySpark Expert (Onsite - Dallas, TX)
Key Responsibilities
- Design, develop, and optimize ETL/ELT pipelines using SQL, Python, and PySpark for large-scale data environments
- Implement scalable data processing workflows in distributed data platforms (e.g., Hadoop, Databricks, or Spark environments)
- Partner with business stakeholders to understand and model mortgage lifecycle data (origination, underwriting, servicing, foreclosure, etc.)
- Create and maintain data marts, views, and reusable data components to support downstream reporting and analytics
- Ensure data quality, consistency, security, and lineage across all stages of data processing
- Assist in data migration and modernization efforts to cloud-based data warehouses (e.g., Snowflake, Azure Synapse, GCP BigQuery)
- Document data flows, logic, and transformation rules
- Troubleshoot performance and quality issues in batch and real-time pipelines
- Support compliance-related reporting (e.g., HMDA, CFPB)
Required Qualifications
- 6+ years of experience in data engineering or data development
- Advanced expertise in SQL (joins, CTEs, optimization, partitioning, etc.)
- Strong hands-on skills in Python for scripting, data wrangling, and automation
- Proficiency in PySpark for building distributed data pipelines and processing large volumes of structured/unstructured data
- Experience working with mortgage banking data sets and related domain knowledge is highly preferred
- Strong understanding of data modeling (dimensional, normalized, star schema)
- Experience with cloud-based platforms (e.g., Azure Databricks, AWS EMR, GCP Dataproc)
- Familiarity with ETL tools and orchestration frameworks (e.g., Airflow, ADF, dbt)