NVIDIA
Multimodal Deep Learning Solution Architect - Vision Language and Action Models
NVIDIA, Italy, New York, United States
Overview
NVIDIA’s Worldwide Field Operations (WWFO) team seeks a Solutions Architect with expertise in multimodal deep learning and Vision-Language Models (VLMs), and a deep understanding of how these models, by combining perception and language with decision-making, enable physical AI in robotics, manufacturing, and healthcare. In this role, you will operate at the intersection of AI research and real-world applications, serving as the primary technical expert for NVIDIA customers. You will develop proof-of-concept solutions, demonstrate modern neural network architectures, and advance how customers leverage multimodal reasoning for robotics and autonomous systems. This role involves close collaboration with developers, data scientists, IT managers, and senior executives. The ideal candidate has deep knowledge of vision-language-action reasoning, large-scale pretraining, data curation, and post-training using supervised fine-tuning and reinforcement learning, with expertise in neural network optimization. Experience applying VLMs to Physical AI use cases is a nice-to-have.
Responsibilities
Serve as the primary technical expert between NVIDIA and customers, understanding their technology and providing AI solutions/guidance on training processes, tools, and methodology.
Build proofs of concept and demonstrations that highlight the power of NVIDIA AI platforms for Vision Language Reasoning Models.
Partner with developers, researchers, technology specialists, IT professionals, and executives to facilitate the integration of NVIDIA technology.
Collaborate with Engineering, Product, and Sales teams to develop and plan the most suitable solutions for customers; enable development and growth of product features through customer feedback and proof-of-concept evaluations.
What We Need To See
MS/PhD or equivalent experience in Computer Science, Data Science, Electrical/Computer Engineering, Physics, Mathematics, or related fields.
Deep expertise in AI/Deep Learning with hands-on experience in training or optimizing VLMs for production.
Expertise with deep learning frameworks for training VLMs (PyTorch, NeMo) and/or experience with optimization tools (TensorRT and Triton Inference Server).
Excellent verbal and written communication and technical presentation skills in English.
5+ years’ work or research experience with Python/C++ or other software development languages.
Passion for AI, a growth mindset, and the ability to collaborate effectively with Engineering, Product, Sales, and Marketing in a rapidly evolving environment while continuously learning and sharing insights.
Ways To Stand Out From The Crowd
Familiarity with Cosmos-Reason and Isaac GR00T
Track record in running large-scale training and customization of VLMs
Track record in neural network inference optimization for Physical AI use cases
Benefits and Equality
NVIDIA offers highly competitive salaries and a comprehensive benefits package. NVIDIA is committed to fostering a diverse work environment and is an equal opportunity employer. We value diversity in our current and future employees and do not discriminate on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status, or any other characteristic protected by law.
JR2004760