Medal

Data Engineer - Clips/ML Data

Medal, New York, New York, US, 10261

The Role

At Medal, we're redefining how people capture and share gameplay experiences. Every day, our platform ingests gameplay video that is raw, unfiltered, and packed with insights. We're looking for a seasoned Data Engineer to take full ownership of our Clips/ML data infrastructure, building the next generation of scalable, real-time pipelines that power everything from user-facing discovery to machine learning research. You'll lead the architecture, operations, and performance of data systems that sit at the heart of our product, influencing everything from content indexing to model training.

If you're passionate about building petabyte-scale video pipelines, love working on low-latency systems, and are excited to help define the future of real-time gaming insights, we want to hear from you.

You Will

- Architect and operate petabyte-scale ingestion pipelines
- Design automated QA guard-rails (schema validation, anomaly detection, deduplication)
- Build high-performance ETL and feature-extraction jobs to process and index hundreds of millions of clips into columnar/video-native formats
- Own the end-to-end data ingestion stack (desktop & mobile recorders, upload services, CDN)
- Establish real-time monitoring, lineage, and "five-nines" SLAs, driving continuous improvement across storage, compute, and network layers
- Partner with research and product to curate high-signal data slices and data-health metrics, and to accelerate model experimentation
- Champion security, privacy, and governance: implement robust RBAC, audit trails, and compliant retention policies for sensitive gameplay footage and user inputs
- Mentor and uplevel engineers (including internal Medal platform talent), fostering a culture of craftsmanship, documentation, and ruthless focus on data excellence

You Need

- 5+ years of experience in data engineering, backend systems, or related roles. Experience with video data or ML infrastructure is a plus.
- Deep knowledge of ETL/ELT pipelines, distributed systems, and streaming data architectures (e.g., Kafka, Spark, Flink)
- Strong proficiency with Python, Scala, Go, or similar languages used in data-intensive environments
- Experience with cloud infrastructure (e.g., AWS, GCP) and modern data stack tools (e.g., dbt, Airflow, Parquet, Arrow)
- Track record of designing systems with extreme scale and performance requirements
- Experience consolidating data from diverse sources into unified data models
- Deep understanding of data QA methodologies, anomaly detection, and automated testing in production systems
- Passion for mentorship and team development; able to upskill engineers and advocate for engineering excellence
- A bias toward ownership, urgency, and a desire to build systems that just work, even at scale

Why Join Us

- Work on cutting-edge tech and help shape the future of gaming
- Passionate team that values ownership and innovation
- Competitive salary, equity options, comprehensive health insurance, 401k