EPM Scientific
Machine Learning Engineer - Post Training
EPM Scientific, San Francisco, California, United States, 94112
A stealth-stage venture backed by Lux Capital (investors in DeepMind and OpenAI) is developing frontier-scale AI systems for high-impact applications in human health and decision-making. The team is applying LLMs and multimodal agents to complex real-world problems where precision, safety, and interpretability matter.
This role focuses on post-training workflows for large models, including RLHF, reward modeling, and evaluation. Ideal candidates have experience aligning agent behavior with nuanced goals in high-stakes environments. The position involves close collaboration with research and product teams to define evaluation criteria, build scalable RL pipelines, and ship production-grade systems.
Ideal Experience:
- Designing and scaling RL environments for LLMs
- Building high-quality evaluation pipelines for frontier models
- Collaborating with domain experts to define evaluation tools and metrics
- Crafting training datasets and reward functions using LLMs and/or human feedback
- Training large models with RLHF, DPO, or instruction tuning
- Translating abstract requirements into concrete evaluation frameworks and agent behaviors
Bonus:
- End-to-end experience shipping ML systems in production
- Prior startup or zero-to-one experience
- Experience with PyTorch, JAX, or other modern ML frameworks
- Familiarity with multi-cloud infrastructure and distributed compute

While familiarity with biomedical data is welcome, candidates from non-bio backgrounds are strongly encouraged to apply. The team values engineering excellence and creativity over domain-specific experience.