Medal, Highlight, & General Intuition

Senior/Lead Data Analyst

Medal, Highlight, & General Intuition, New York, New York, US, 10261

About Medal

At Medal, we’re redefining the way gamers connect, share, and relive their greatest in-game moments. Our platform makes it easy to clip, edit, and share gaming content, whether you're capturing a legendary headshot or just hanging with friends in voice chat. Over 2 million gamers use Medal every month to showcase their moments, and we’re just getting started. We're a fast-moving team backed by top-tier investors. Our culture is builder-first: we move quickly, make decisions with creators in mind, and aren’t afraid to challenge conventions when it improves the product.

The Role

We’re looking for a Senior/Lead Data Analyst to own the quality of the video data that powers Medal’s machine learning features. You’ll partner closely with ML researchers, data engineering, and product to measure, diagnose, and improve the accuracy, completeness, and reliability of our video datasets and labels. If you love turning messy, high-volume media data into trustworthy, measurable assets, and you get excited about building feedback loops that make ML systems smarter, this is for you.

You Will

Own the video data quality program: define quality KPIs (coverage, precision/recall, calibration, temporal alignment, label latency, drift) and build dashboards that make them visible company-wide.

Audit datasets at scale using SQL and Python: create automated checks for codec/bitrate/fps/resolution, audio/video sync, corruption, duplicates, and long-tail coverage by game, device, and region.

Design ground-truth pipelines: human-in-the-loop reviews and labeling guidelines; measure annotator agreement, and iterate to improve label quality.

Diagnose model-data issues: collaborate with ML to localize failure modes, quantify data gaps, and prioritize data collection or relabeling to move accuracy on real user content.

Detect bias and drift across games, platforms, and cohorts; propose mitigations and monitor post-launch.

Instrument product and ingestion to capture the metadata ML needs (e.g., encoding, device, frame rate, content type) while respecting privacy and safety constraints.

Run experiments: design and analyze A/B tests and holdouts to connect data quality improvements to model and product outcomes.

Champion best practices in data contracts, validation, reproducibility, and documentation; mentor analysts and influence data quality culture.

Work on-site at our NYC office 5 days a week.

You Need

5+ years in data analytics or data science with a focus on media or ML data quality in production systems.

Fluency in SQL and Python (Pandas/NumPy); you’re comfortable building reproducible notebooks and code-reviewed pipelines.

Strong measurement chops: you’ve defined and computed label & model quality metrics (precision/recall/F1, mAP, AUROC, calibration, temporal IoU) and can explain their trade-offs.

Data validation & ETL experience: Great Expectations/TFDV (or equivalent), dbt, and an orchestrator (Airflow/Prefect).

Warehouse & BI: BigQuery (or similar) plus Looker/Mode/Tableau (or similar); you build clear dashboards and know when to run deep dives.

Experimentation: A/B testing design and analysis; comfort with pitfalls and guardrails.

Product sense & communication: you turn ambiguous problems into measurable roadmaps and communicate findings clearly to technical and non-technical partners.

A love for gaming, however you define it.

Bonus Points

Experience running annotation programs (Label Studio, CVAT, Scale, or custom tooling) and crafting labeling taxonomies for actions/events/scenes.

Hands-on with video tooling: ffmpeg/ffprobe for metadata & probes; familiarity with OpenCV (and running lightweight inference with PyTorch/TensorFlow for scoring/spot checks).

Duplicate/near-duplicate detection (perceptual hashing, embeddings/FAISS/Milvus) and dataset dedup at scale.

Privacy, safety, and policy considerations for user-generated video (GDPR/COPPA basics, PII redaction, content safety heuristics).

Spark/PySpark or distributed compute for heavy lifts.

Familiarity with CV/ASR signals (scene boundary, keypoint/action recognition, speech-to-text) to enrich labels and audits.

Prior history as a Medal user: share a clip or your profile!

Why Join Us

Directly shape the data foundation behind ML features used by millions of gamers.

Work with a passionate team that values ownership, craftsmanship, and speed.

Competitive salary, equity options, comprehensive health insurance, and 401k.

See your work translate into more accurate models and better creator experiences, fast.
