India, Jan. 29 -- Alibaba Cloud has unveiled its latest vision-language model, Qwen2.5-VL, a significant upgrade over its predecessor, Qwen2-VL. The open-source, multimodal model is offered in three sizes, with 3 billion, 7 billion, and 72 billion parameters, and includes both base and instruction-tuned versions.

The flagship model, Qwen2.5-VL-72B-Instruct, is now accessible through the Qwen Chat platform, while the entire Qwen2.5-VL series is available on Hugging Face and on ModelScope, Alibaba's open-source community.

Qwen2.5-VL demonstrates strong multimodal capabilities, excelling in advanced visual comprehension of text, charts, diagrams, graphics, and layouts within images. It can also understand videos longer than an...