Toyota Research Institute
Research Scientist, Robotics VLAs Post-Training and Adaptation
Toyota Research Institute, Los Altos, California, United States, 94024
At Toyota Research Institute (TRI), we’re on a mission to improve the quality of human life. We’re developing new tools and capabilities to amplify the human experience. To lead this transformative shift in mobility, we’ve built a world-class team in Energy & Materials, Human-Centered AI, Human-Interactive Driving, Automated Driving, and Robotics.
Overview
We are seeking a creative and technically strong researcher to advance post-training methods for Vision-Language-Action (VLA) models in robotics. This role focuses on improving model alignment, robustness, and adaptability in real-world robotic settings through advanced post-training and continual learning techniques. You will develop algorithms and frameworks that enable persistent learning and optimize data efficiency in embodied systems.
Responsibilities
Post-training and adaptation:
Design and implement post-training pipelines for VLA models using techniques such as reinforcement learning (RL), reinforcement learning from human or preference feedback (RLHF/RLAIF), and in-context learning. Experience with real-world RL is a plus!
Sim-to-real transfer:
Develop methods to enhance real-world transferability of policies trained in simulation.
Reset-free and continual learning:
Explore and implement reset-free and autonomous data collection strategies that enable continual skill improvement without manual resets or supervision. Develop methods that learn continually in settings with large-scale, long-term data collection.
Structured exploration:
Investigate exploration algorithms that balance safety, curiosity, and efficiency for data gathering in both simulation and real-world robotic systems.
Data curation and feedback loops:
Lead the design of data collection and curation pipelines for exploration and post-training, using multimodal data from demonstrations, teleoperation, and on-policy rollouts.
Collaborate across teams in perception, control, and ML infrastructure to deploy scalable and reproducible research systems.
Publish research outcomes and contribute to the open robotics and embodied AI communities.
Qualifications
Ph.D. or M.S. in Robotics, Machine Learning, Computer Vision, or a related field, or equivalent applied research experience.
Expertise in reinforcement learning, imitation learning, and multimodal representation learning.
Strong proficiency with deep learning frameworks (e.g., PyTorch, JAX) and robotics simulation environments (e.g., MuJoCo, IsaacSim, PyBullet, Habitat).
Experience with sim-to-real transfer, policy adaptation, or continual learning in embodied settings.
Strong coding and experimental skills with an emphasis on reproducibility and evaluation at scale.
Prior robotics experience with real-world hardware and ML-based robot deployments.
Bonus Qualifications
Prior work on VLA models (e.g., PI0/PI0.5, OpenVLA, custom models).
Experience building or managing robot data collection infrastructure.
Familiarity with real-world robot platforms (e.g., Franka, humanoids, or mobile manipulators).
Publications in top-tier conferences (CoRL, RSS, NeurIPS, ICLR, ICML, ICRA, CVPR).
The pay range for this position at commencement of employment is expected to be between $176,000 and $264,000/year for California-based roles. Base pay offered will depend on multiple individualized factors, including, but not limited to, business or organizational needs, market location, job-related knowledge, skills, and experience. TRI offers a generous benefits package including medical, dental, and vision insurance, 401(k) eligibility, paid time off benefits (including vacation, sick time, and parental leave), and an annual cash bonus structure. Additional details regarding these benefit plans will be provided if an employee receives an offer of employment.
We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.