Vmax
Member of Technical Staff - Automated Environment Design
Vmax, San Francisco, California, United States, 94199
About Vmax
Vmax is an applied research lab working at the frontier of reinforcement learning (RL). We are building new techniques for leveraging RL with Large Language Models (LLMs). Our research contributes directly to our RL platform, which automates the engineering involved in converting data and evals into RL environments.
About the role
To scale RL, we must scale the creation of environments that are tractable for agents to learn from and that capture the full richness and variety of the tasks an agent is expected to perform. We are looking for scientists to join us in developing this novel program of AI research: applying the principles of RL to environment generation and post-training itself.
Responsibilities
Develop optimization-based methods for automatically generating RL environments
Establish normative baselines for measuring the quality of RL environments
Create infrastructure to reliably generate environments that incorporate historical data and agentic evals
Own and develop a research agenda within Vmax
Requirements
PhD or equivalent experience in ML
Track record of research excellence, as demonstrated by publications, open-source work, or publicly deployed AI systems
Deep understanding of RL and ML
Expertise with Python and an ML framework (PyTorch, JAX)
Nice to have
Experience with LLM post-training
Demonstrated software engineering experience
Skilled in presenting the results and implications of your work to multiple audiences
Role-specific location policy
This role is based in our San Francisco office; for exceptional candidates, we are willing to consider a hybrid arrangement.
Compensation
The expected salary range for this position is $250,000 - $450,000 USD.