Abstract
The author presents a study of large-vocabulary continuous Mandarin speech recognition based on a segmental probability model (SPM) approach. SPM was found to be very suitable for recognition of isolated Mandarin syllables especially considering the monosyllabic structure of Chinese language. To extend the application of the model to continuous Mandarin speech recognition, a concatenated syllable matching (CSM) algorithm in place of the conventional Viterbi search algorithm is first introduced. Also, to utilise the available training material efficiently, a training procedure is proposed to re-estimate the SPM parameters using the maximum a posteriori (MAP) algorithm. A few special techniques integrating acoustic and linguistic knowledge are developed further to improve the performance step by step. Preliminary experimental results show that the final achievable rate is as high as 91.62%, which indicates a 18.48% error rate reduction and more than three times faster than the well studied subsyllable-based CHMM.
Published Version
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have