Discovered MENA

Lead Data Scientist - Remote

Discovered MENA, San Francisco, California, United States, 94199


Artificial Intelligence & Data Talent Partner at Discovered MENA Remote Role

What you’ll be working on:

Designing and building hybrid ML models that combine supervised learning, time-series forecasting, and NLP to extract insights from unstructured data like PDFs, fund memos, and regulatory filings

Adding explainability to models using techniques like SHAP, LIME, and feature attribution so outputs are transparent and human-readable

Building scalable data pipelines across off-chain fundamentals, on-chain activity, and macro benchmarks

Integrating data from sources like FRED, PitchBook LCD, Securitize, Centrifuge, Maple, and TrueFi, with strong data lineage and freshness guarantees

Developing anomaly detection and reconciliation tools across issuer, administrator, and blockchain datasets

Creating evaluation frameworks to measure accuracy, confidence intervals, latency, and data quality

Backtesting model outputs against historical NAVs, secondary-market trades, and redemptions

Researching and incorporating credit-risk signals (CDS spreads, recovery rates, default data, etc.)

Building continuous learning loops using live market data and partner feedback

Working closely with Product and Engineering to ship models via APIs, SDKs, and dashboards used by traders, curators, and risk teams

Collaborating with data providers, protocol teams, and fund administrators to improve coverage and signal quality

Partnering with the CTO on long-term model governance, transparency, and AI ethics

What I’m looking for:

5+ years of experience in applied ML, quantitative finance, or credit-risk modeling

Strong Python and SQL skills, plus experience with ML frameworks like PyTorch, TensorFlow, scikit-learn, or XGBoost

Solid understanding of time-series forecasting, regression/classification, and probabilistic modeling

Hands-on experience with financial data (fixed income, private credit, or structured products)

Familiarity with blockchain and DeFi data, including smart contracts, token metadata, and on-chain events

Experience deploying ML models into production (APIs, orchestration, or streaming systems)

Nice to have:

Background in credit analytics, NAV valuation, or structured credit

Experience in quant research, fintech data science, or tokenized asset analytics

Experience with NLP, vector databases, and LLMs / GenAI tools (OpenAI APIs, GPT-4, LangChain, HuggingFace, etc.)

Seniority level: Mid‑Senior level

Employment type: Full‑time

Job function: Science and Information Technology
