Worth AI

Principal Data Engineer

Worth AI, Miami

Worth AI, a leader in the computer software industry, is looking for a talented and experienced Principal Data Engineer to join our innovative team. At Worth AI, we are on a mission to revolutionize decision‑making with the power of artificial intelligence while fostering an environment of collaboration, adaptability, and a commitment to making a meaningful impact in the tech landscape. Our team values include extreme ownership, one team, and creating fans among both our employees and our customers.

Responsibilities

What you will do:

  • Architecture & Strategy
    • Define end‑to‑end data architecture (lake/lakehouse/warehouse, batch/streaming, CDC, metadata)
    • Set standards for schemas, contracts, orchestration, storage layers, and semantic/metrics models
    • Publish roadmaps, ADRs/RFCs, and "north star" target states; guide build vs. buy decisions
  • Platform & Pipelines
    • Design and build scalable, observable ELT/ETL and event pipelines
    • Establish ingestion patterns (CDC, file, API, message bus) and schema‑evolution policies
    • Provide self‑service tooling for analysts/scientists (dbt, notebooks, catalogs, feature stores)
    • Ensure workflow reliability (idempotency, retries, backfills, SLAs)
  • Data Quality & Governance
    • Define dataset SLAs/SLOs, freshness, lineage, and data certification tiers
    • Enforce contracts and validation tests; deploy anomaly detection and incident runbooks
    • Partner with governance on cataloging, PII handling, retention, and access policies
  • Reliability, Performance & Cost
    • Lead capacity planning, partitioning/clustering, and query optimization
    • Introduce SRE‑style practices for data (error budgets, postmortems)
    • Drive FinOps for storage/compute; monitor and reduce cost per TB/query/job
  • Security & Compliance
    • Implement encryption, tokenization, and row/column‑level security; manage secrets and audits
    • Align with SOC 2 and privacy regulations (e.g., GDPR/CCPA; HIPAA if applicable)
  • ML & Analytics Enablement
    • Deliver versioned, documented datasets/features for BI and ML
    • Operationalize training/serving data flows, drift signals, and feature‑store governance
    • Build and maintain the semantic layer and metrics consistency for experimentation/BI
  • Leadership & Collaboration
    • Provide technical leadership across squads; mentor senior/staff engineers
    • Run design reviews and drive consensus on complex trade‑offs
    • Translate business goals into data products with product/analytics leaders

Requirements

  • 10+ years in data engineering (including 3+ years as staff/principal or equivalent scope)
  • Proven leadership of company‑wide data architecture and platform initiatives
  • Deep experience with at least one cloud (AWS) and a modern warehouse or lakehouse (e.g., Snowflake, Redshift, Databricks)
  • Strong SQL and proficiency in at least one programming language (Python, Scala, or Java)
  • Experience with orchestration (Airflow/Dagster/Prefect), transformations (dbt or equivalent), and streaming (Kafka/Kinesis/Pub/Sub)
  • Data modeling (3NF, star, data vault) and semantic/metrics layers
  • Data quality testing, lineage, and observability in production environments
  • Security best practices: RBAC/ABAC, encryption, key management, auditability

Nice to Have

  • Feature stores and ML data ops; experimentation frameworks
  • Cost optimization at scale; multi‑tenant architectures
  • Governance tools (DataHub/Collibra/Alation), OpenLineage, and testing frameworks (Great Expectations/Deequ)
  • Compliance exposure (SOC 2, GDPR/CCPA; HIPAA/PCI where relevant)
  • Model features sourced from complex 3rd‑party data (KYB/KYC, credit bureaus, fraud detection APIs)

Benefits

  • Health Care Plan (Medical, Dental & Vision)
  • Retirement Plan (401(k), IRA)
  • Life Insurance
  • Unlimited Paid Time Off
  • 9 Paid Holidays
  • Family Leave
  • Work From Home
  • Free Food & Snacks (Access to Industrious Co‑working Membership!)
  • Wellness Resources
