India, Jan. 29 -- Alibaba Cloud has unveiled its latest vision-language model, Qwen2.5-VL, a significant upgrade over its predecessor, Qwen2-VL. The open-source multimodal model comes in three sizes, with 3 billion, 7 billion, and 72 billion parameters, and includes both base and instruction-tuned versions.
The flagship model, Qwen2.5-VL-72B-Instruct, is now accessible through the Qwen Chat platform, while the entire Qwen2.5-VL series is available on Hugging Face and on ModelScope, Alibaba's open-source community.
Qwen2.5-VL demonstrates remarkable multimodal capabilities, excelling at visual comprehension of text, charts, diagrams, graphics, and layouts within images. It can also understand videos longer than an...