Vmax
Member of Technical Staff - Open Endedness
Vmax, San Francisco, California, United States, 94199
About Vmax
Vmax is an applied research lab working at the frontier of reinforcement learning (RL). We are building new techniques for leveraging RL with Large Language Models (LLMs). Our research contributes directly to our RL platform, which automates the engineering involved in converting data and evals into RL environments.
About the role
Our goal is to automate the design of tasks for RL agents to help them learn domain-specific skills. In this role you will develop approaches to optimize the construction of tasks to maximize resulting agent performance.
Responsibilities
Develop new approaches to task construction, building on the literature in open endedness and unsupervised environment design
Develop new reward functions for environment design
Benchmark agents that learn in generated environments
Validate your research on industry-specific problems
Role Requirements
AI PhD or equivalent experience
Track record of research excellence, as demonstrated by publications, open source work or publicly deployed AI systems
Deep understanding of RL and ML
Expertise with Python and an ML framework (PyTorch, JAX)
Nice to have
Experience in post-training LLMs
Experience researching evolutionary optimization
Experience researching unsupervised environment design
Skilled in presenting the results and implications of your work to audiences at multiple levels
Role-specific location policy
This role is based in our San Francisco office; for exceptional candidates we are willing to consider a hybrid arrangement.
Compensation
The expected salary range for this position is $250,000 - $450,000 USD.