Bosch USA

Research Scientist- Vision-Language-Action Models for Autonomous Systems

Bosch USA, Sunnyvale, California, United States, 94087


Overview

Research Scientist position focused on Vision-Language-Action Models for Autonomous Systems at Bosch USA. The role involves advancing Embodied AI for AIoT applications in autonomous driving, robotics, and related domains, collaborating across global teams to transfer research into Bosch's platforms.

Responsibilities

- Conduct research and engineering in core AI and machine learning fields to enable Embodied AI (including computer vision, autonomous planning, open-world learning, and related areas) for AIoT business domains such as autonomous driving, industrial automation, and robotics.
- Push the boundaries in end-to-end perception and planning for automated driving, incorporating advances in vision-language-action models to aid reasoning and explainability.
- Collaborate with a global team to transfer cutting-edge research findings to Bosch's operational units.
- Implement research results to solve real-world challenges and ensure high-quality system integration within Bosch's platforms.
- Stay current with the latest technological advancements and market trends by attending academic conferences, technical events, and seminars.
- Document and disseminate research findings through high-caliber publications and/or patent submissions.

Qualifications

Basic Qualifications

- Ph.D. in Computer Science, Robotics, or a related discipline, or a Master's degree with 1–3 years of industry experience after graduation.
- Minimum of 3 years of R&D experience in AI technologies, including computer vision and robotic or automotive motion and behavioral planning, or an equivalent graduate research background.
- Proficiency in one or more ML programming languages (e.g., Python, C++, Rust).
- Strong interpersonal, communication, and teamwork skills.
- Knowledge of major ML frameworks such as TensorFlow or PyTorch.
- Hands-on experience building multimodal transformer-based sequence-to-sequence models.
- Familiarity with vision-language-action model concepts such as MoE, GRPO, and LoRA.

Preferred Qualifications

- Experience with real-world product development and deployment of autonomous systems.
- Hands-on experience in computer vision and deep learning, in areas such as multimodal transformers, multimodal language models, diffusion models, NeRF, Gaussian splatting, object detection/segmentation, 3D scene understanding, sensor calibration, SfM, and voxel/BEV grid features.
- Strong publication record in premier ML, DL, robotics, and CV venues.

Benefits

We offer a competitive base salary, with a California range of $160,000–$200,000, plus an annual corporate bonus and a long-term incentive bonus. Salary is determined by experience, knowledge, role complexity, and location. We provide a comprehensive benefits package including premium health coverage, a 401(k) with generous matching, financial planning resources, ample paid time off, parental leave, and life and disability protection. More details are available from the recruiter during the interview process.

Equal Opportunity Employer, including disability and veterans. Employment is contingent upon successful drug screening and background checks.
