Abstract

The predominant language model for speech recognition is the n-gram language model, which is learned from local context and usually lacks global linguistic information such as long-distance syntactic constraints. To overcome this problem, we first explore two n-best re-scoring approaches for Mandarin speech recognition. The first is linear re-scoring, which combines several language models built from different perspectives; the weights of these models are optimized with a minimum error rate training method. The second is a discriminative re-scoring approach that exploits rich syntactic features. To alleviate the scarcity of transcribed speech data for training the discriminative model, we propose a domain adaptation method that trains the model on a Chinese pinyin-to-character conversion dataset. We then present a cascaded approach that combines the two re-scoring models in a pipeline, taking the probability output of the linear re-scoring model as the initial weight of the discriminative model. Experimental results show that both re-scoring approaches outperform the baseline system, and the cascaded approach achieves the best performance.
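The linear re-scoring idea described above can be sketched as a weighted combination of model scores over an n-best list. The following is a minimal illustrative sketch, not the paper's implementation: the function names, the toy hypotheses, and the hand-set weights are all assumptions (in the paper, the weights are tuned by minimum error rate training).

```python
# Hypothetical sketch of linear n-best re-scoring: each candidate
# transcription receives a weighted sum of scores from several language
# models, and the highest-scoring candidate is selected. All names,
# scores, and weights below are illustrative, not from the paper.

def linear_rescore(nbest, models, weights):
    """Return the hypothesis maximizing the weighted sum of model scores.

    nbest   : list of candidate transcriptions (strings)
    models  : list of scoring functions, each mapping a hypothesis to a
              log-probability-like score
    weights : one interpolation weight per model (tuned by MERT in the paper)
    """
    def combined(hyp):
        return sum(w * m(hyp) for w, m in zip(weights, models))
    return max(nbest, key=combined)

# Toy example: a locally trained n-gram score and a syntax-based score.
ngram_score  = {"hyp a": -1.0, "hyp b": -1.5}.get
syntax_score = {"hyp a": -4.0, "hyp b": -0.5}.get
models = [ngram_score, syntax_score]

best = linear_rescore(["hyp a", "hyp b"], models, weights=[0.3, 0.7])
print(best)  # → hyp b (the syntax model dominates at these weights)
```

With weight mostly on the syntax model, "hyp b" wins despite its lower n-gram score, which is exactly the effect re-scoring is meant to provide: injecting global linguistic evidence that the locally learned n-gram model misses.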
