OCBridge
About the Role
We are seeking talented Machine Learning Engineers specializing in Reinforcement Learning (RL) to develop our end-to-end autonomous driving model. You will leverage both RLHF and closed-loop RL in scalable simulators to improve our model. Working with massive real-world driving scenarios, you will enhance safety, comfort, and user experience in autonomous vehicles deployed at scale.

Basic Requirements
Master’s or PhD in Machine Learning, Robotics, or related field
2–4 years of RL experience (or strong internship background for new grads)

Core Technical Skills
Proficiency in modern RL algorithms: DQN, PPO, SAC, TD3, etc.
Experience with multi-agent RL systems
Proficiency in modern RLHF algorithms: PPO, DPO, GRPO, etc.
Hands-on experience training reward models and fine-tuning LLM/VLM/VLA
Knowledge of distributed RL training at scale

Simulation
Proficiency with massively parallel simulation environments
Knowledge of sim-to-real transfer techniques and domain randomization

Deep Learning & Engineering
Proficiency in Python, comfortable with C++
Expertise in PyTorch
Experience with distributed training frameworks (Ray, Horovod, etc.)
Knowledge of model optimization (quantization, pruning) and CUDA is a plus

Autonomous Driving Domain Knowledge
Understanding of AV system architecture (perception, planning, control)
Experience with AV simulation platforms: CARLA, AirSim, SUMO, or similar
Knowledge of traffic rules and driving behavior modeling

Key Responsibilities
Implement RLHF algorithms to align our end-to-end driving model with human preferences and safety criteria
Develop reward models that capture nuanced driving behaviors from real-world scenarios
Develop closed-loop RL training systems that scale to millions of simulation episodes per second
Create sim-to-real transfer strategies bridging simulation training and real-world performance
Collaborate with cross-functional teams for end-to-end system integration in both simulation and onboard deployment

Preferred Qualifications
Open-source contributions to RL libraries or autonomous driving projects
Previous experience with LLM fine-tuning using RLHF
Knowledge of safe RL, interpretable AI, or robustness techniques
Familiarity with autonomous vehicle regulations and safety standards

For New Graduates
We welcome exceptional new graduates with:
Strong academic background in RL/ML with relevant research projects
Experience with simulation environments and distributed computing
Ability to implement complex algorithms from scratch
Portfolio of projects showcasing RL applications

What We Offer
Competitive compensation package with equity options
Professional development budget for conferences and continued learning

Seniority level: Entry level
Employment type: Full-time
Job function: Engineering and Information Technology
Industries: Human Resources Services