R CUBE CREATIVE CONSULTING INC
Responsibilities
Design, build, and optimize ELT-based data pipelines that are reliable, scalable, and aligned to business goals
Own and evolve data architecture (data lake/warehouse/lakehouse patterns), including ingestion, transformation, and serving layers
Architect and implement end-to-end data solutions, from source systems to curated, analytics-ready datasets
Build and maintain data models (dimensional modeling, curated marts) to support reporting, BI, and advanced analytics
Develop and operationalize data products (reusable datasets, semantic layers, curated tables/views, standardized metrics) for analysts and data scientists
Improve data quality through validation, testing, monitoring, lineage, and alerting; drive data reliability and SLAs (a minimal example of such a check follows this list)
Partner closely with data science/ML engineers on feature engineering, training datasets, and production-ready pipelines
Establish and enforce data engineering standards: coding practices, reviews, documentation, naming conventions, and pipeline design patterns
Lead complex projects: scope, roadmap, technical direction, risk management, and cross‑team coordination
Evaluate and introduce emerging tools/processes to improve productivity, cost, and performance (e.g., orchestration, transformation, observability)
Mentor junior engineers and promote a culture of reuse, scalability, operational excellence, and knowledge sharing
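As an illustration of the data-quality work described above, here is a minimal Python sketch of a validation check; the table shape, column names, and rules are hypothetical examples, not requirements of this role.

```python
# Illustrative only: a minimal data-quality check of the kind this role owns.
# The table, column names, and thresholds below are hypothetical.
import pandas as pd


def validate_orders(df: pd.DataFrame) -> list[str]:
    """Return a list of human-readable validation failures (empty = pass)."""
    failures = []
    # Completeness: key columns must not contain nulls.
    for col in ("order_id", "customer_id", "order_ts"):
        null_count = df[col].isna().sum()
        if null_count > 0:
            failures.append(f"{col}: {null_count} null values")
    # Uniqueness: the primary key must not repeat.
    dupes = df["order_id"].duplicated().sum()
    if dupes > 0:
        failures.append(f"order_id: {dupes} duplicate rows")
    # Range check: amounts should be non-negative.
    negatives = (df["amount"] < 0).sum()
    if negatives > 0:
        failures.append(f"amount: {negatives} negative values")
    return failures


if __name__ == "__main__":
    sample = pd.DataFrame(
        {"order_id": [1, 2, 2],
         "customer_id": [10, None, 12],
         "order_ts": pd.to_datetime(["2024-01-01"] * 3),
         "amount": [9.99, -1.0, 5.0]}
    )
    for problem in validate_orders(sample):
        print("FAIL:", problem)
```

In production such checks typically run inside the pipeline (e.g., as dbt tests or orchestrated tasks) and feed the monitoring and alerting mentioned above.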
Qualifications
7+ years of hands-on data engineering experience (senior/lead-level ownership)
Strong expertise in SQL (advanced joins, window functions, performance tuning, query optimization); a window-function example appears after this list
Proven experience building scalable pipelines using modern DE tools such as:
Spark/Databricks (distributed processing, performance tuning, partitioning, job optimization)
dbt (modular transformations, tests, documentation, environments)
Python within the open-source data ecosystem (pandas/PySpark, APIs, automation, packaging)
Solid experience with data warehouses/distributed databases such as Snowflake/Redshift (or similar), including modeling and transformations, load strategies, clustering/partitioning, and cost/performance considerations
Strong understanding of data architecture & scalability: distributed systems concepts, batch vs. streaming patterns, storage formats (Parquet/Delta), and data lifecycle management
Proficiency with software engineering fundamentals: Git workflows, CI/CD pipelines, code reviews, testing strategies, and tools like JIRA
Comfortable in Linux environments and scripting with Bash/Zsh
Hands-on experience with cloud data platforms (e.g., Azure): compute, storage, security, and networking basics for data workloads
Strong critical thinking and problem‑solving skills—can troubleshoot pipeline failures, data issues, performance bottlenecks, and ambiguous requirements
Ability to lead, mentor, and collaborate cross-functionally with BI developers, ML engineers, analysts, and solution architects
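As an illustration of the SQL window-function and Spark skills listed above, here is a minimal PySpark sketch; the dataset, column names, and deduplication rule are hypothetical.

```python
# Illustrative only: a SQL-style window function expressed in PySpark,
# of the kind the SQL and Spark/Databricks qualifications refer to.
# The dataset and the "keep latest row per user" rule are hypothetical.
from pyspark.sql import SparkSession, Window
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("window-demo").getOrCreate()

events = spark.createDataFrame(
    [("u1", "2024-01-01", 5),
     ("u1", "2024-01-02", 9),
     ("u2", "2024-01-01", 3)],
    ["user_id", "event_date", "score"],
)

# Equivalent to SQL: ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY event_date DESC)
w = Window.partitionBy("user_id").orderBy(F.col("event_date").desc())

# Keep only each user's most recent event, then drop the helper column.
latest = (
    events.withColumn("rn", F.row_number().over(w))
    .filter(F.col("rn") == 1)
    .drop("rn")
)
latest.show()
```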
Nice‑to‑have Skills
Familiarity with BI tools like Power BI, Tableau, Looker, or Alteryx
Knowledge of data governance, metadata/lineage, and observability practices
Exposure to ML concepts (feature engineering, datasets, model handoffs)