Lila Sciences
Lila Sciences is the world's first scientific superintelligence platform and autonomous lab for life, chemistry, and materials science. We are pioneering a new age of boundless discovery by applying AI to every aspect of the scientific method and solving humankind's greatest challenges in health, climate, and sustainability.
At Lila, we are uniquely cross-functional and collaborative. We seek individuals with an inclusive mindset and a diversity of thought. Our teams thrive in unstructured and creative environments where every voice is heard.
If this sounds like an environment you'd love to work in, please apply.
Overview
As a Data Scientist in our Physical Sciences organization, you will transform complex experimental and testing datasets into actionable insights that drive our autonomous lab's decision-making. You'll partner with electrochemists, synthesis chemists, characterization specialists, and automation engineers to ensure data quality, build predictive models, and inform scientific campaigns across materials and device development.
Responsibilities
Data Infrastructure: Design and maintain robust ETL pipelines to ingest, validate, and preprocess data from diverse sources: electrochemical tests, materials characterization, and automated lab instruments.
Feature Engineering & Modeling: Perform domain-relevant data transformations, extract meaningful descriptors from raw data (e.g., voltage curves, spectroscopic signatures, image-based measurements), and develop statistical or machine learning models to relate independent variables (time, composition, etc.) to performance metrics and failure modes.
Analytics & Visualization: Create interactive dashboards and reports to communicate trends, anomalies, and key insights to scientific and engineering teams.
Active Learning Support: Collaborate with ML scientists to integrate analytical outputs into active learning loops, helping to prioritize experiments and optimize resource allocation.
Cross-Functional Partnership: Work closely with R&D leadership, Product Managers, and automation specialists to translate scientific questions into data requirements and modeling strategies.
Reproducibility & Documentation: Establish best practices for code versioning, data provenance, and analysis notebooks; contribute to internal knowledge bases and publications.
What You'll Need to Succeed
Master's or Ph.D. in Data Science, Statistics, Materials Science, Chemistry, Physics, or a related quantitative field.
2+ years of experience in data analysis, statistical modeling, or machine learning, ideally applied to physical sciences or engineering datasets.
Proficiency in Python (pandas, NumPy, scikit-learn) and SQL for data manipulation and analysis.
Hands-on experience building ETL workflows using tools like Airflow, Prefect, or similar.
Strong foundation in experimental design, statistical inference, and multivariate analysis.
Familiarity with data visualization libraries (Plotly, Dash, or similar) and dashboard frameworks.
Experience working with electrochemical or materials characterization data (e.g., impedance spectroscopy, X-ray diffraction, electron microscopy).
Familiarity with materials-specific Python libraries (e.g., pymatgen).
Exposure to cloud-based data platforms (AWS, GCP, or Azure) and scalable storage solutions.
Knowledge of containerization (Docker, Singularity) and workflow orchestration (Snakemake, Nextflow).
Prior contributions to open-source data tools or scientific software.
Understanding of active learning, Bayesian optimization, or uncertainty quantification in experimental contexts.
Equal Employment Opportunity
Lila Sciences is committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, or veteran status.
Seniority level
Mid-Senior level
Employment type
Full-time
Job function
Research and Science
Industries
Technology, Information and Internet