ByteDance
ByteDance, San Jose, California, United States, 95199
Student Researcher [Seed LLM - Scientific Coding Agent (LLMs × Physics/Chemistry; RL preferred)] – 2026 Start (PhD)
Join ByteDance to develop a scientific coding agent that plans, writes, debugs, and executes scientific code to accelerate physics, chemistry, and biology research.
Responsibilities
Benchmark scientific coding agents and curate a principled benchmark suite spanning coding and problem‑solving tasks.
Curate datasets, write robust unit tests, and implement evaluation metrics to build a clean, extensible evaluation pipeline with baselines.
Prototype loops in which agents propose and refine scientific language and tie it to executable code and simulations.
Build agent workflows (multi‑agent planning, tool use, self‑critique) and integrate code execution sandboxes, retrieval, and experiment runners.
Compare LLMs and inference strategies; run ablations and produce clean research artifacts (plots, tables, write‑ups).
Qualifications
Currently pursuing a PhD in Computer Science, Machine Learning, Programming Systems, Physics, Chemistry, Biology, Applied Math, or a related field.
Strong Python and software engineering skills: Git, testing (pytest), packaging, Linux/containers.
Working knowledge of LLMs (prompting, fine‑tuning, adapters, evaluation) and at least one ML framework (e.g., PyTorch).
Solid foundations in physics and/or chemistry (classical/quantum/thermo/stat mech, physical chemistry, molecular modeling) and numerical methods (ODE/PDE, optimization, linear algebra).
Preferred: Experience with reinforcement learning (PPO/DPO, reward modeling, curriculum learning).
Benefits
Interns have day‑one access to health insurance, life insurance, wellbeing benefits, 10 paid holidays, and paid sick time. A housing allowance may be offered for non‑remote positions.