Abstract

Large-scale Mongolian speech corpora that researchers can use as an experimental reference are scarce. Mongolian is a minority language whose speakers are geographically scattered, which makes speech difficult to collect and label, and this scarcity hinders the further development of Mongolian speech recognition. To advance research in this area, our research group has constructed IMUT-MC, a speech corpus for Mongolian speech recognition tasks that contains about 212 hours of read speech recorded by 417 speakers. We used IMUT-MC to conduct baseline experiments with both traditional and end-to-end speech recognition models. The models based on GMM-HMM, DNN-HMM and Transformer achieve word error rates of 69.90%, 67.45% and 26.10% on IMUT-MC, respectively, which indicates that IMUT-MC is a reliable corpus for Mongolian speech recognition.

