Madiff

Data Scientist

Madiff, Poland, New York, United States

At Madiff, we design and deliver advanced AI solutions for top US enterprises, building large‑scale commercial systems powered by Generative AI and LLMs. Our teams in Poland, Portugal, France, and the UK collaborate with global consulting firms and Fortune 500 clients across banking, telecom, and high‑tech industries.

We are expanding our Data Science practice to strengthen our capability in developing and deploying production‑grade AI systems. This role focuses on creating LLM‑based architectures, optimising data pipelines, and applying advanced NLP, RAG, and predictive analytics in enterprise environments.

You will work at the intersection of data science and AI engineering: designing intelligent systems, orchestrating models with LangChain and LangGraph, and supporting the industrialisation of GenAI solutions across complex business platforms.

Key Responsibilities

Design, fine‑tune, and optimise LLMs and GenAI pipelines using Python, PyTorch, TensorFlow, MLflow, LangChain, and LangGraph

Build and scale RAG systems, embeddings, and vector‑based retrieval (Pinecone, FAISS, Chroma)

Develop and deploy predictive models for classification and forecasting within CI/CD environments using Docker and MLflow

Integrate LLMs with APIs and business systems to deliver automation and decision intelligence at scale

Implement AI governance frameworks including model monitoring, compliance, and explainability

Lead internal adoption workshops, mentor junior team members, and drive delivery excellence

Requirements

At least 3–4 years of experience as a Data Scientist with practical exposure to Generative AI and LLM systems

Advanced Python programming with strong knowledge of PyTorch, TensorFlow, Scikit‑learn, Pandas, and NumPy

Proven experience with LangChain, LangGraph, or similar orchestration frameworks

Understanding of RAG architectures, embeddings, and vector databases (Pinecone, FAISS, Weaviate, Milvus)

Familiarity with cloud platforms (Azure, Databricks, AWS) and MLOps tools (MLflow, Docker, CI/CD)

Solid foundation in predictive modelling, NLP, and statistical analysis

Fluent English (spoken and written)

Nice to Have

Experience with Dataiku, Apache Spark, or Streamlit for experimentation and visualisation

Domain background in financial services, healthcare, or enterprise SaaS

MSc or PhD in Data Science, AI, or a related field

Fully remote collaboration model
