HMM-Based Uyghur Continuous Speech Recognition System

Wushour Silamu, Nasirjan Tursun

doi:10.1109/csie.2009.717

Abstract

This paper presents UASRS, an HMM-based continuous speech recognition system for the Uyghur language. Uyghur is an agglutinative language and one of the least studied languages in speech recognition, so our first task was to build a Uyghur continuous speech database. At the acoustic level, we used the widely adopted hidden Markov model (HMM) to model the Uyghur speech data; at the language level, we modeled the Uyghur text data with an N-gram language model. We then built the Uyghur continuous speech recognition system using the recognizer of HTK 3.3 (HMM Toolkit) and MS Visual C++ 8.0. The paper also presents recognition experiments on Uyghur continuous speech using UASRS. On the test set, the recognition rate was 68.98% for sentences and 94.65% for words; for real-time speech recognition, it was 51.49% for sentences and 85.82% for words.
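To illustrate what an N-gram language model of the kind mentioned in the abstract does, here is a minimal Python sketch of a bigram (N = 2) model. It is not the authors' implementation: the abstract does not state the N-gram order, the smoothing method, or the training text, so the choice of bigrams, the add-one smoothing, the function names, and the placeholder tokens `a`, `b`, `c`, `d` are all assumptions made for illustration.

```python
from collections import Counter
import math

def train_bigram(sentences):
    """Count unigrams and bigrams over tokenized sentences.

    Each sentence is padded with <s> and </s> boundary markers.
    """
    unigrams = Counter()
    bigrams = Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams

def bigram_prob(prev, word, unigrams, bigrams):
    """Return P(word | prev) with add-one smoothing (an assumed choice)."""
    vocab_size = len(unigrams)
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)

def sentence_logprob(tokens, unigrams, bigrams):
    """Sum the log-probabilities of every bigram in the padded sentence."""
    padded = ["<s>"] + tokens + ["</s>"]
    return sum(
        math.log(bigram_prob(prev, word, unigrams, bigrams))
        for prev, word in zip(padded, padded[1:])
    )

# Hypothetical toy corpus; the tokens are placeholders, not Uyghur data.
corpus = [["a", "b", "c"], ["a", "b", "d"]]
unigrams, bigrams = train_bigram(corpus)

seen_order = sentence_logprob(["a", "b", "c"], unigrams, bigrams)
scrambled = sentence_logprob(["c", "b", "a"], unigrams, bigrams)
print(seen_order > scrambled)
```

A word order that occurs in the training text scores higher than a scrambled one. A recognizer uses this kind of score to prefer plausible word sequences among acoustically similar hypotheses.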
