Info Way Solutions
Taxonomist / Data Quality Analyst
We are seeking a hands‑on Taxonomist / Data Quality Analyst to play a key role in structuring, standardizing, and enriching our data assets. This individual will help define and capture metadata, build and maintain taxonomies, and drive continuous improvement in data quality. You’ll work across teams to make data more discoverable, consistent, and trustworthy, ensuring it supports decision‑making, analytics, and content discovery across our media ecosystem.
Key Responsibilities
Data Discovery & Profiling:
Conduct data inventories; assess data health, completeness, and lineage; identify patterns, anomalies, and areas for improvement.
Metadata Modeling:
Define and maintain data dictionaries, business glossaries, and metadata schemas (both technical and business); standardize naming conventions and documentation practices.
Taxonomy & Classification:
Design and manage controlled vocabularies, tagging schemes, and hierarchical taxonomies; map synonyms and relationships; apply metadata at scale.
Data Quality Improvement:
Implement validation rules, deduplication logic, and standardization workflows; monitor data quality and resolve defects through collaboration with engineering teams.
Canonical Mapping:
Align disparate data sources into a unified schema; document transformations, lineage, and provenance.
Cataloging & Stewardship:
Populate and maintain a centralized data catalog with lineage, ownership, usage notes, and access classifications.
Engage with data engineering, BI, and product teams to align definitions, capture use cases, and integrate metadata and quality checks into pipelines.
Documentation & Enablement:
Publish standards, best practices, and change logs; train teams on taxonomy usage, metadata conventions, and stewardship principles.
Compliance & Ethics:
Apply retention, sensitivity, and privacy tags (e.g., PII); ensure adherence to governance policies and escalate compliance risks as needed.
Required Qualifications
Bachelor’s degree in Information Science, Data Management, Computer Science, or a related field.
3+ years of experience in data analysis, data stewardship, taxonomy, or metadata management, ideally within the media & entertainment domain.
Strong SQL skills for profiling and transformation; experience with Python or R (pandas, data wrangling) preferred.
Experience defining data dictionaries, glossaries, and metadata schemas.
Hands‑on experience with data quality processes—deduplication, validation, standardization, and root‑cause analysis.
Clear and concise communication skills with the ability to bridge technical and business contexts.
Familiarity with data cataloging and taxonomy tools such as Alation, Collibra, Atlan, DataHub, PoolParty, SmartLogic, or Synaptica.
Knowledge of metadata standards (e.g., Dublin Core, schema.org) and data privacy regulations (GDPR, CCPA).
Success Metrics
Percentage of priority datasets cataloged with complete metadata and ownership.
Reduction in data quality issues (duplicates, invalid values, nulls).
Improved time‑to‑discover datasets/fields; increase in catalog search success.
Organization‑wide adoption of taxonomy and controlled vocabularies.
First 90 Days Objectives
Audit and profile key datasets across Apple TV+, Apple Music, and App Store, producing a lightweight data health report.
Stand up or enhance a data dictionary and align naming standards with key stakeholders.
Define an initial taxonomy and tagging framework, applying it to top‑priority datasets.
Establish a set of data quality rules and monitoring metrics.
Document data processes and recommend tool or workflow improvements for scalability.
Tooling & Technologies
Catalog / Lineage:
Alation, Collibra, Atlan, DataHub, OpenMetadata
Workflow / Version Control:
Git, Jira
Seniority Level: Mid‑Senior level
Employment Type: Full‑time
Job Function: Information Technology
Industries: IT Services and IT Consulting
Los Angeles, CA | $66,785.00‑$96,690.00