WorkHQ

Lead Data Engineer at WorkHQ, Los Angeles, CA

WorkHQ, Los Angeles, California, United States, 90079


Company Context

Series A, well-funded US startup in HRTech developing WorkHQ.com and an AI Recruiter product.

This is a remote role, open only to candidates based in the mainland US.

Role Overview

Lead data infrastructure architect managing billions of data points across 250M+ professional profiles.

You will also hire data engineers to support you in this work.

Core Responsibilities

Design scalable data pipelines processing massive record volumes

Architect ETL processes using PySpark on Amazon EMR (open to shifting to other solutions such as Databricks or Snowflake)

Distribute enriched data through medallion architecture across Postgres, Athena, OpenSearch

Integrate new data sources into the main pipeline

Implement advanced data matching using Splink

Technical Requirements

5-8 years of professional data engineering experience

Good proficiency in:

PySpark and distributed computing

AWS data services (EMR, Glue, Athena)

Docker

Pandas and DataFrame manipulation

Complex data format handling (JSONL, Parquet)

Strong background in:

Big data processing architectures

Data warehouse design

Performance optimization

Advanced Python and SQL skills

Nice to Have

Probabilistic record linking expertise

OpenSearch/Elasticsearch technologies

Machine learning data pipeline design

Recruitment tech ecosystem knowledge

Technical Stack

Big Data: PySpark, EMR

Databases: Postgres, OpenSearch

Cloud: AWS

Containerization: Docker

Data Formats: JSONL, Parquet

Analytics: Metabase, Athena, Glue

Data Processing: Pandas, Splink

Other Considerations

This role has specific requirements, but if you lack a few of the technical skills and are motivated to learn and to lead the platform, please apply for consideration.

If you are coming from a Director, Head of, or VP level role that is relevant to this job, you are welcome to apply as well.

You will need to apply directly on our platform.

Thank you for your time.
