Discovered MENA
Lead Data Scientist - Remote
Discovered MENA, San Francisco, California, United States, 94199
Artificial Intelligence & Data Talent Partner at Discovered MENA
Remote Role
What you’ll be working on:
Designing and building hybrid ML models that combine supervised learning, time-series forecasting, and NLP to extract insights from unstructured data like PDFs, fund memos, and regulatory filings
Adding explainability to models using techniques like SHAP, LIME, and feature attribution so outputs are transparent and human-readable
Building scalable data pipelines across off-chain fundamentals, on-chain activity, and macro benchmarks
Integrating data from sources like FRED, PitchBook LCD, Securitize, Centrifuge, Maple, and TrueFi, with strong data lineage and freshness guarantees
Developing anomaly detection and reconciliation tools across issuer, administrator, and blockchain datasets
Creating evaluation frameworks to measure accuracy, confidence intervals, latency, and data quality
Backtesting model outputs against historical NAVs, secondary-market trades, and redemptions
Researching and incorporating credit-risk signals (CDS spreads, recovery rates, default data, etc.)
Building continuous learning loops using live market data and partner feedback
Working closely with Product and Engineering to ship models via APIs, SDKs, and dashboards used by traders, curators, and risk teams
Collaborating with data providers, protocol teams, and fund administrators to improve coverage and signal quality
Partnering with the CTO on long-term model governance, transparency, and AI ethics
What I’m looking for:
5+ years of experience in applied ML, quantitative finance, or credit-risk modeling
Strong Python and SQL skills, plus experience with ML frameworks like PyTorch, TensorFlow, scikit-learn, or XGBoost
Solid understanding of time-series forecasting, regression/classification, and probabilistic modeling
Hands-on experience with financial data (fixed income, private credit, or structured products)
Familiarity with blockchain and DeFi data, including smart contracts, token metadata, and on-chain events
Experience deploying ML models into production (APIs, orchestration, or streaming systems)
Nice to have:
Background in credit analytics, NAV valuation, or structured credit
Experience in quant research, fintech data science, or tokenized asset analytics
Experience with NLP, vector databases, and LLMs / GenAI tools (OpenAI APIs, GPT-4, LangChain, HuggingFace, etc.)
Seniority level: Mid‑Senior level
Employment type: Full‑time
Job function: Science and Information Technology