Toyota Research Institute

Research Scientist, Robotics VLAs Post-Training and Adaptation

Toyota Research Institute, Los Altos, California, United States, 94024

At Toyota Research Institute (TRI), we’re on a mission to improve the quality of human life. We’re developing new tools and capabilities to amplify the human experience. To lead this transformative shift in mobility, we’ve built a world-class team in Energy & Materials, Human-Centered AI, Human-Interactive Driving, Automated Driving, and Robotics.

Overview

We are seeking a creative and technically strong researcher to advance post-training methods for Vision-Language-Action (VLA) models in robotics. This role focuses on improving model alignment, robustness, and adaptability in real-world robotic settings through advanced post-training and continual learning techniques. You will develop algorithms and frameworks that enable persistent learning and optimize data efficiency in embodied systems.

Responsibilities

Post-training and adaptation:

Design and implement post-training pipelines for VLA models using techniques such as reinforcement learning (RL), reinforcement learning from human or preference feedback (RLHF/RLAIF), and in-context learning. Experience with real-world RL is a plus!

Sim-to-real transfer:

Develop methods to enhance real-world transferability of policies trained in simulation.

Reset-free and continual learning:

Explore and implement reset-free and autonomous data collection strategies that enable continual skill improvement without manual resets or supervision. Develop methods that learn continually in settings with large-scale, long-term data collection.

Structured exploration:

Investigate exploration algorithms that balance safety, curiosity, and efficiency for data gathering in both simulation and real-world robotic systems.

Data curation and feedback loops:

Lead the design of data collection and curation pipelines for exploration and post-training, using multimodal data from demonstrations, teleoperation, and on-policy rollouts.

Collaborate across teams in perception, control, and ML infrastructure to deploy scalable and reproducible research systems.

Publish research outcomes and contribute to the open robotics and embodied AI communities.

Qualifications

Ph.D. or M.S. in Robotics, Machine Learning, Computer Vision, or a related field, or equivalent applied research experience.

Expertise in reinforcement learning, imitation learning, and multimodal representation learning.

Strong proficiency with deep learning frameworks (e.g., PyTorch, JAX) and robotics simulation environments (e.g., MuJoCo, IsaacSim, PyBullet, Habitat).

Experience with sim-to-real transfer, policy adaptation, or continual learning in embodied settings.

Strong coding and experimental skills with an emphasis on reproducibility and evaluation at scale.

Prior robotics experience with real-world hardware and ML-based robot deployments.

Bonus Qualifications

Prior work on VLA models (e.g., PI0/PI0.5, OpenVLA, custom models).

Experience building or managing robot data collection infrastructure.

Familiarity with real-world robot platforms (e.g., Franka, humanoids, or mobile manipulators).

Publications in top-tier conferences (CoRL, RSS, NeurIPS, ICLR, ICML, ICRA, CVPR).

The pay range for this position at commencement of employment is expected to be between $176,000 and $264,000/year for California-based roles. Base pay offered will depend on multiple individualized factors, including, but not limited to, business or organizational needs, market location, job-related knowledge, skills, and experience. TRI offers a generous benefits package including medical, dental, and vision insurance, 401(k) eligibility, paid time off benefits (including vacation, sick time, and parental leave), and an annual cash bonus structure. Additional details regarding these benefit plans will be provided if an employee receives an offer of employment.

We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are made by humans. If you would like more information about how your data is processed, please contact us.
