Principal Data Engineer

ZipRecruiter, Boulder, Colorado, United States, 80301

Job Description

As a Principal Data Engineer, you will lead a team that builds and maintains scalable, reliable data pipelines bridging on-premises and cloud environments. You'll leverage your expertise in data engineering, streaming technologies, and leadership to drive the team toward its business objectives.

Key Responsibilities:

Team Leadership:

Lead, mentor, and manage a team of data engineers specializing in streaming technologies.

Data Pipeline Development:

Design and implement high-throughput, low-latency streaming data pipelines using Apache Kafka, ensuring integration with cloud services (e.g., BigQuery, Looker).

Data Analytics:

Oversee the development of stream processing applications using Apache Spark or Apache Flink, and implement real-time data transformations and analytics using KSQL.

Data Storage:

Design and maintain scalable data storage solutions with ClickHouse for fast analytics on streaming data.

ETL Processes:

Lead the design and implementation of ETL processes that extract, transform, and load data into the data warehouse.

Data Quality:

Ensure data integrity, consistency, and accuracy through robust data quality assurance practices.

Optimization:

Optimize performance and implement best practices in data engineering, covering data quality, security, and efficiency.

Collaboration:

Collaborate with stakeholders to gather requirements and align data strategies with business objectives.

Technology Updates:

Stay current with emerging technologies in streaming and cloud environments, evaluating their potential application.

Qualifications:

5+ years of hands-on data engineering experience with Python, Scala, or Java

3+ years of experience with cloud vendors (AWS, Azure, GCP), data warehouse services (e.g., Redshift, Databricks), and cloud storage

Expertise in KSQL, ClickHouse, ETL/ELT tools (e.g., Airflow, ADF, Glue, NiFi), and orchestration

Proficiency in code versioning (Git) and CI/CD pipelines

Experience with stream processing tools (Apache Kafka, Apache Flink, Apache Spark Structured Streaming)

Strong understanding of data modeling, optimization techniques, and stream processing patterns

Excellent leadership and mentorship skills with at least 2 years in a leadership role

Nice to Have:

Experience with NoSQL databases, Kafka Connect, Kafka Streams, and Superset for data visualization

Knowledge of integrating real-time machine learning models into streaming environments

Expertise with monitoring and observability tools for streaming systems

Why Join Overwatch Agriculture?

At Overwatch Agriculture, we prioritize people over processes, fostering a supportive and tech-savvy environment. Our customized benefits support your well-being and professional growth, making this an ideal opportunity for tech enthusiasts.