Storm3

Research Scientist - Vision, Multimodality

Storm3, San Francisco, California, United States, 94199

Come join one of the few research institutions worldwide with the resources to compete with top AI companies: tens of thousands of GPUs to explore state-of-the-art research in LLMs, multimodal AI, and agentic AI.

Currently seeking AI talent with expertise in multimodal AI, video/image generation, diffusion models, and visual understanding to develop the next breakthrough in physical and generative machine intelligence.

Base pay range $250,000.00/yr – $400,000.00/yr

⚡ Research Scientists/Engineers (all levels)

Responsibilities

Research & develop novel methods in Video/Image Generation and Multimodality

Contribute to large-scale vision data infrastructure and model training

Source, process, and curate high-quality image and video data

Publish breakthrough findings and reports at top conferences

Requirements

PhD in Computer Science, Machine Learning, or Mathematics preferred

Strong publication record at top conferences, or contributions to leading AI models, particularly in controllable generative models, world models, or physical AI

2+ years of industry experience focused on VLMs, diffusion, visual understanding, and 2D image/video

Strong ability to navigate ambiguity and influence research direction

Why apply

Opportunity to join a fast-growing core team that is already driving AI breakthroughs

Highly competitive salary package

Work alongside ambitious and bright superstars from tech and academia

Medical, Dental and Vision Insurance

Interested in applying? Please click the ‘Easy Apply’ button or email your resume to anir.gantugs@storm3.com
