Abstract
We propose the use of self-organizing maps (SOMs) and learning vector quantization (LVQ) as an initialization method for training continuous observation density hidden Markov models (CDHMMs). We apply CDHMMs to model phonemes in the transcription of speech into phoneme sequences. The Baum-Welch maximum likelihood estimation method is very sensitive to the initial parameter values when the observation densities are represented by mixtures of many Gaussian density functions. We therefore suggest training CDHMMs in two phases. First, vector quantization methods are applied to find suitable placements for the means of the Gaussian density functions representing the observed training data. Maximum likelihood estimation is then used to find the mixture weights and state transition probabilities and to re-estimate the Gaussians to obtain the best possible models. Initializing the means of the distributions by SOMs or LVQ allows good recognition results to be achieved with substantially fewer Baum-Welch iterations than are needed with random initial values. Likewise, in the segmental K-means algorithm the number of iterations can be markedly reduced with a suitable initialization. We furthermore experiment with enhancing the discriminatory power of the phoneme models by adaptively training the state output distributions using the LVQ algorithm.

© 1992 SPIE--The International Society for Optical Engineering. Downloading of the abstract is permitted for personal use only.
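The first phase described above, placing Gaussian mixture means with a vector quantization method before maximum likelihood re-estimation, can be sketched with a minimal one-dimensional SOM in plain NumPy. This is an illustrative reconstruction, not the authors' implementation: the unit count, decay schedules, and toy data are assumptions, and the trained codebook vectors stand in for the initial mixture means of a CDHMM state.

```python
import numpy as np

def train_som(data, n_units=8, n_iters=2000, lr0=0.5, radius0=None, seed=0):
    """Train a 1-D self-organizing map on feature vectors.

    The learned unit weights serve as initial means for the Gaussian
    mixture of an HMM state (illustrative sketch; hyperparameters are
    assumptions, not taken from the paper).
    """
    rng = np.random.default_rng(seed)
    # Initialize codebook vectors from randomly drawn training samples.
    weights = data[rng.choice(len(data), n_units, replace=False)].astype(float)
    if radius0 is None:
        radius0 = n_units / 2.0
    for t in range(n_iters):
        x = data[rng.integers(len(data))]
        # Best-matching unit: nearest codebook vector to the sample.
        bmu = np.argmin(np.sum((weights - x) ** 2, axis=1))
        # Linearly decaying learning rate and neighborhood radius.
        frac = t / n_iters
        lr = lr0 * (1.0 - frac)
        radius = max(radius0 * (1.0 - frac), 0.5)
        # Gaussian neighborhood function over the 1-D map topology:
        # units close to the BMU on the map are pulled toward the sample.
        d = np.arange(n_units) - bmu
        h = np.exp(-(d ** 2) / (2.0 * radius ** 2))
        weights += lr * h[:, None] * (x - weights)
    return weights

# Toy data: two well-separated clusters; the SOM units spread over both,
# giving sensible starting means for subsequent Baum-Welch re-estimation.
rng = np.random.default_rng(1)
data = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(5, 1, (200, 2))])
means = train_som(data, n_units=4)
```

In the second phase, these means would be passed to Baum-Welch (or segmental K-means) as the starting point, which then estimates mixture weights and transition probabilities and refines the Gaussians.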