Nigeria, Feb. 3 -- The WAXAL dataset, which took more than three years to develop, aims to address the shortage of high-quality speech data that has limited the use of voice-based technologies across much of Sub-Saharan Africa.
WAXAL contains speech data for 21 African languages, including Hausa, Yoruba, Igbo, Swahili, Luganda, and Acholi. According to Google, the dataset is intended to support more than 100 million speakers whose languages are largely absent from existing speech recognition and voice synthesis systems.
The dataset includes more than 11,000 hours of speech recordings drawn from nearly two million individual audio samples. Of this total, about 1,250 hours are fully transcribed natural speech, which can be used to train automatic s...