Diverse Lynx

Sr. Data Engineer

Diverse Lynx, San Francisco, California, United States, 94199


Job Title: Sr. Data Engineer
Location: San Francisco, CA (Remote)
Type: Contract

Job Description:

Architect and design metadata-driven, scalable, secure, and cost-efficient Lakehouse solutions on Azure that leverage Delta Lake for ACID transactions, schema enforcement, and time travel. Deep integration with Azure Data Lake Storage (ADLS) enables unified handling of structured, semi-structured, and unstructured data, delivering high-performance analytics with built-in governance, security, lineage, and extensibility at scale.

Design Lakehouse-based dimensional data models that align with modern data warehousing principles and support scalable, high-performance analytical workloads and advanced analytics use cases.

Establish best practices for job scheduling, cluster configuration, modular notebook design, and code reusability.

Lead development of metadata-driven onboarding templates, pipeline frameworks, and automation tools to accelerate source integration and data processing.

Ensure every design is developed in close collaboration with the EDM, IT Infrastructure, and IT Security teams to meet architectural, operational, and compliance standards.

Responsibilities:

Design and develop scalable batch and streaming data pipelines using PySpark on Azure Databricks, following the medallion architecture pattern (Bronze, Silver, Gold) and applying Delta Lake best practices for reliability, scalability, and data quality.

Proficiency in PySpark-based distributed data processing; well-versed in Delta Lake, Auto Loader, Structured Streaming, and Delta Live Tables (DLT) to build reliable, high-throughput data pipelines, with additional experience leveraging the Databricks SDK and REST APIs for workflow automation, job orchestration, and operational monitoring.

Strong working experience with Spark internals and PySpark constructs, including DataFrame APIs, UDFs, window functions, complex joins, and performance profiling, while adhering to best practices for optimization, partitioning, schema evolution, and ACID-compliant Delta Lake writes.

Build Lakehouse-aligned dimensional data models to support high-performance analytics, BI, and operational reporting.

Well-versed in dimensional modeling techniques (e.g., Ralph Kimball) with hands-on experience implementing dimensional models in at least two cloud Lakehouse projects.

Extensive knowledge of and hands-on experience in building PySpark-based data pipelines incorporating Change Data Capture (CDC) and Slowly Changing Dimension (SCD) Type 2 logic.

Contribute to the development of modular, reusable transformation components and notebooks using standardized framework patterns.

Experience implementing Spark Structured Streaming pipelines with checkpointing, watermarking, and trigger-based scheduling.

Build and orchestrate robust data workflows using Databricks Workflows, incorporating task dependencies, parameterization, retry logic, timeout handling, and alerting mechanisms to ensure reliable and maintainable pipeline execution.

Apply performance tuning techniques including caching, adaptive query execution, optimal file sizing, and broadcast joins.

Strong understanding of data governance principles with hands-on experience using Unity Catalog to manage schema organization, enforce role-based access control (RBAC), implement fine-grained permissions, and enable end-to-end data lineage tracking across the Lakehouse.

Automate deployments and workflow operations using REST APIs and Databricks SDKs with support for parameterization and CI/CD integration.

Experience with Azure Data Factory (ADF) for building ETL/ELT pipelines and integrating with Databricks for seamless data orchestration.

Experience with Azure Data Services and their seamless integration with Azure Databricks.

Nice to Have:

Familiarity with IBM DataStage, including migration of legacy workloads.

Proficiency in Power BI for advanced dashboard creation, reporting, and data visualization.

Diverse Lynx LLC is an Equal Employment Opportunity employer. All qualified applicants will receive due consideration for employment without any discrimination. All applicants will be evaluated solely on the basis of their ability, competence and their proven capability to perform the functions outlined in the corresponding role. We promote and support a diverse workforce across all levels in the company.