Performances comparison between Improved DHMM and Gaussian Mixture HMM for speech recognition

Shing-Tai Pan, Yi-Heng Tsai, Ching-Fa Chen, Wei-Der Chang

doi:10.1109/cisp.2011.6100771

Abstract

This paper compares the performance, in terms of recognition rate and computation speed, of an improved Discrete Hidden Markov Model (DHMM) and a Gaussian Mixture Hidden Markov Model (GMHMM) for Mandarin speech recognition. Fuzzy vector quantization (FVQ) is used to improve the DHMM modeling. A codebook for the DHMM is first trained by the K-means algorithm on Mandarin training speech features. Then, based on the trained codebook, the speech features are quantized by fuzzy sets, and the resulting statistics are used to train the DHMM. Experimental results show that training the DHMM with the FVQ algorithm improves the speech recognition rate. The recognition rate of the improved DHMM is only slightly lower than that of the GMHMM, while its computation time for recognition is much shorter. These results indicate that the improved DHMM is better suited to real-time applications than the GMHMM.
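The pipeline outlined in the abstract (K-means codebook, fuzzy quantization, statistical training of the discrete emission probabilities) can be illustrated with a minimal sketch. The abstract does not give the membership function or the training equations, so everything here is an assumption for illustration: the fuzzy c-means style membership with fuzzifier `m`, the function names `kmeans_codebook`, `fuzzy_memberships`, and `soft_emission_counts`, and the random stand-in data, which are not Mandarin speech features and not the authors' implementation.

```python
import numpy as np

def kmeans_codebook(features, k, iters=20, seed=0):
    """Plain K-means; returns a (k, dim) codebook of centroids."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct training frames.
    codebook = features[rng.choice(len(features), size=k, replace=False)].copy()
    for _ in range(iters):
        dists = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = features[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                codebook[j] = members.mean(axis=0)
    return codebook

def fuzzy_memberships(features, codebook, m=2.0, eps=1e-12):
    """Assumed fuzzy c-means style membership of each frame in each codeword.

    Returns a (T, k) matrix whose rows sum to 1. Hard VQ would instead put
    all the weight on the nearest codeword.
    """
    d2 = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2) + eps
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def soft_emission_counts(memberships, state_posteriors):
    """Accumulate membership-weighted counts into emission probabilities.

    memberships: (T, k); state_posteriors: (T, n_states), e.g. from a
    forward-backward pass. Returns a row-normalised (n_states, k) matrix.
    """
    counts = state_posteriors.T @ memberships
    return counts / counts.sum(axis=1, keepdims=True)

# Stand-in data: random vectors in place of real speech feature frames.
rng = np.random.default_rng(1)
features = rng.normal(size=(200, 12))
codebook = kmeans_codebook(features, k=8)
U = fuzzy_memberships(features, codebook)
gamma = rng.dirichlet(np.ones(3), size=200)  # made-up state posteriors
B = soft_emission_counts(U, gamma)
```

In this sketch each frame contributes fractionally to several codewords rather than to a single one, which is one plausible reading of how fuzzy quantization could smooth the discrete emission estimates. At recognition time a discrete model only needs lookups into a matrix like `B`, whereas a Gaussian mixture model evaluates mixture densities for every frame and state.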
