2024-09-28

The Allen Institute for AI (Ai2) has launched Molmo, a family of state-of-the-art multimodal models. The Molmo family includes its “most open and powerful multimodal model today, and the most efficient model,” Ai2 said.



Molmo can “understand” a wide range of images, from recognizing everyday objects to reading complex charts and menus. The models can also “point to what they perceive” on screens, such as the visual and interactive features real-world users see.

According to Ai2, Molmo is “creative” and can “brainstorm” designs, and even tell jokes and stories.

Ai2 is making Molmo’s training, fine-tuning, and other data openly available alongside the models.

“Multimodal AI models are typically trained on billions of images. We have instead focused on using extremely high quality data but at a scale that is 1000 times smaller,” Ani Kembhavi, senior director of research at Ai2, said in a statement. “This has produced models that are as powerful as the best proprietary systems, but with fewer hallucinations and much faster to train, making our model far more accessible to the community.”