Worth AI
Job Description
Worth AI, a leader in the computer software industry, is looking for a talented and experienced Principal Data Engineer to join our innovative team. At Worth AI, we are on a mission to revolutionize decision-making with the power of artificial intelligence while fostering an environment of collaboration and adaptability, aiming to make a meaningful impact in the tech landscape. Our team values include extreme ownership, one team, and creating raving fans among both our employees and customers.
Worth is looking for a Principal Data Engineer to own the company-wide data architecture and platform. Design and scale reliable batch / streaming pipelines, institute data quality and governance, and enable analytics / ML with secure, cost-efficient systems. Partner with engineering, product, analytics, and security to turn business needs into durable data products.
Responsibilities: What you will do
Architecture & Strategy
Define end-to-end data architecture (lake / lakehouse / warehouse, batch / streaming, CDC, metadata).
Set standards for schemas, contracts, orchestration, storage layers, and semantic / metrics models.
Publish roadmaps, ADRs / RFCs, and “north star” target states; guide build vs. buy decisions.
Platform & Pipelines
Design and build scalable, observable ELT / ETL and event pipelines.
Establish ingestion patterns (CDC, file, API, message bus) and schema-evolution policies.
Provide self-service tooling for analysts / scientists (dbt, notebooks, catalogs, feature stores).
Ensure workflow reliability (idempotency, retries, backfills, SLAs).
Data Quality & Governance
Define dataset SLAs / SLOs, freshness, lineage, and data certification tiers.
Enforce contracts and validation tests; deploy anomaly detection and incident runbooks.
Partner with governance on cataloging, PII handling, retention, and access policies.
Reliability, Performance & Cost
Lead capacity planning, partitioning / clustering, and query optimization.
Introduce SRE-style practices for data (error budgets, postmortems).
Drive FinOps for storage / compute; monitor and reduce cost per TB / query / job.
Security & Compliance
Implement encryption, tokenization, and row / column-level security; manage secrets and audits.
Align with SOC 2 and privacy regulations (e.g., GDPR / CCPA; HIPAA if applicable).
ML & Analytics Enablement
Deliver versioned, documented datasets / features for BI and ML.
Operationalize training / serving data flows, drift signals, and feature-store governance.
Build and maintain the semantic layer and metrics consistency for experimentation / BI.
Leadership & Collaboration
Provide technical leadership across squads; mentor senior / staff engineers.
Run design reviews and drive consensus on complex trade-offs.
Translate business goals into data products with product / analytics leaders.
Requirements
10+ years in data engineering (including 3+ years as staff / principal or equivalent scope).
Proven leadership of company-wide data architecture and platform initiatives.
Deep experience with at least one cloud (AWS) and a modern warehouse or lakehouse (e.g., Snowflake, Redshift, Databricks).
Strong SQL and one programming language (Python or Scala / Java).
Orchestration (Airflow / Dagster / Prefect), transformations (dbt or equivalent), and streaming (Kafka / Kinesis / PubSub).
Data modeling (3NF, star, data vault) and semantic / metrics layers.
Data quality testing, lineage, and observability in production environments.
Security best practices: RBAC / ABAC, encryption, key management, auditability.
Nice to Have
Feature stores and ML data ops; experimentation frameworks.
Cost optimization at scale; multi-tenant architectures.
Governance tools (DataHub / Collibra / Alation), OpenLineage, and testing frameworks (Great Expectations / Deequ).
Compliance exposure (SOC 2, GDPR / CCPA; HIPAA / PCI where relevant).
Model features sourced from complex 3rd-party data (KYB / KYC, credit bureaus, fraud detection APIs).
Benefits
Health Care Plan (Medical, Dental & Vision)
Retirement Plan (401k, IRA)
Life Insurance
Unlimited Paid Time Off
9 Paid Holidays
Family Leave
Work From Home
Free Food & Snacks (Access to Industrious Co-working Membership!)
Wellness Resources