Boson AI
This range is provided by Boson AI. Your actual pay will be based on your skills and experience — talk with your recruiter to learn more.
Base pay range $150,000.00/yr - $600,000.00/yr
Boson AI is an early‑stage startup building large language tools for everyone to use. Our founders (Alex Smola, Mu Li) and a team of Deep Learning, Optimization, NLP, AutoML and Statistics scientists and engineers are working on high‑quality generative AI models for language and beyond.
We are seeking research scientists and engineers to join our team full‑time in our Santa Clara office. In this role, you will design model architectures, propose new loss objectives, and advance the capabilities of generative multimodal models. The ideal candidate has a strong background in machine learning and is motivated to develop state‑of‑the‑art models on the path toward AGI.
We encourage you to apply even if you do not believe you meet every single qualification. As long as you are motivated to learn and join the development of foundation models, we’d love to chat.
Responsibilities
Design model architectures and loss objectives to handle combinations of images, video, text, speech, and audio data
Build diverse datasets to support multimodal learning, including data collection and processing
Develop new evaluation pipelines to adapt to various forms of generative outputs
Qualifications
Experience in writing clean and efficient code
Master's or doctoral degree in computer science or an equivalent field
Proficiency in at least one deep learning framework, such as PyTorch or JAX
Participation in at least one research project related to multimodal learning
Strong candidates may also have
Experience in general multimodal learning research (e.g., multimodal joint embedding, text‑to‑image generation, text‑to‑video generation)
Experience in document understanding (e.g., layout analysis, structured data extraction, OCR)
Experience in audio transcription, diarization, audio generation, etc.
Active GitHub contributions are a big plus
Experience in handling billion‑scale data
Seniority level Not Applicable
Employment type Full‑time
Job function Engineering and Information Technology
Industries Hospitality, Food and Beverage Services, and Retail