Abstract

Minority languages in China have distinctive characteristics, so the automatic speech recognition (ASR) methods developed for major languages such as Chinese, English, and Japanese cannot be adopted directly. In this paper, we take Mongolian, a resource-deficient language, as an example and build acoustic and language models for use with the ATRASR system. We focus in particular on language modeling that accounts for the characteristics of Mongolian: we train a multi-class N-gram language model based on similar-word clustering. With the proposed language model, system performance improved by 5.5% compared with a conventional word N-gram model.
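
To illustrate why clustering words into classes can help a resource-deficient language, the following is a minimal sketch of a plain class-based bigram model. It is not the paper's method: the abstract does not describe the multi-class formulation or the clustering procedure, so the sketch assumes a single hand-made word-to-class map, unsmoothed maximum-likelihood estimates, and English placeholder words. All names (`train_class_bigram`, `bigram_prob`, `word2class`) are hypothetical.

```python
from collections import Counter

def train_class_bigram(sentences, word2class):
    """Collect the counts needed for an unsmoothed class-based bigram model."""
    class_bigrams = Counter()   # count(c_prev, c)
    class_context = Counter()   # count(c_prev) when used as a history
    word_counts = Counter()     # count(w)
    class_counts = Counter()    # count(c), summed over its member words
    for sent in sentences:
        classes = [word2class[w] for w in sent]
        for w, c in zip(sent, classes):
            word_counts[w] += 1
            class_counts[c] += 1
        for c_prev, c in zip(classes, classes[1:]):
            class_bigrams[(c_prev, c)] += 1
            class_context[c_prev] += 1
    return class_bigrams, class_context, word_counts, class_counts

def bigram_prob(w_prev, w, model, word2class):
    """P(w | w_prev) = P(class(w) | class(w_prev)) * P(w | class(w))."""
    class_bigrams, class_context, word_counts, class_counts = model
    c_prev, c = word2class[w_prev], word2class[w]
    if class_context[c_prev] == 0 or class_counts[c] == 0:
        return 0.0
    p_class = class_bigrams[(c_prev, c)] / class_context[c_prev]
    p_word = word_counts[w] / class_counts[c]
    return p_class * p_word

# Toy data (assumed for illustration; not taken from the paper's corpus).
word2class = {
    "i": "PRON", "we": "PRON",
    "read": "VERB", "write": "VERB",
    "books": "NOUN", "letters": "NOUN",
}
sentences = [
    ["i", "read", "books"],
    ["we", "write", "letters"],
    ["i", "write", "books"],
]
model = train_class_bigram(sentences, word2class)

# "we read" never occurs in the toy corpus, yet it receives a nonzero
# probability because "we" and "read" share classes with observed words.
p_unseen = bigram_prob("we", "read", model, word2class)
```

A word bigram model trained on the same three sentences would assign zero probability to "we read" without smoothing; sharing statistics across class members is what lets the class-based model generalize from sparse data.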
