Mixture HMMs Research Articles

To improve recognition performance in noisy environments, multicondition training is usually applied, in which speech signals corrupted by a variety of noises are used in acoustic model training. Published hidden Markov modeling of speech uses multiple Gaussian distributions to cover the spread of the speech distribution caused by noise, which distracts from the modeling of the speech event itself and possibly sacrifices performance on clean speech. In this paper, we propose a novel approach which extends the conventional Gaussian mixture hidden Markov model (GMHMM) by modeling the state emission parameters (mean and variance) as polynomial functions of a continuous environment-dependent variable. At recognition time, a set of HMMs specific to the given value of the environment variable is instantiated and used for recognition. The maximum-likelihood (ML) estimation of the polynomial functions of the proposed variable-parameter GMHMM is given within the expectation-maximization (EM) framework. Experiments on the Aurora 2 database show significant improvements of the variable-parameter Gaussian mixture HMMs over the conventional GMHMMs.
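The instantiation step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the coefficient layout (row k holds the coefficient of the k-th power of the environment variable), the diagonal covariance, and the variance floor are all assumptions made for illustration. The abstract does not say how the variance polynomial is kept positive, so the floor below is only a placeholder for that detail.

```python
import numpy as np


def instantiate_gaussian(mean_coeffs, var_coeffs, env, var_floor=1e-3):
    """Evaluate per-dimension polynomials at one environment value.

    mean_coeffs, var_coeffs: arrays of shape (K + 1, D); row k holds the
    coefficient of env**k for each of the D feature dimensions
    (hypothetical layout, not taken from the paper).
    env: scalar environment variable, e.g. an SNR estimate (assumption).
    """
    mean_coeffs = np.asarray(mean_coeffs, dtype=float)
    var_coeffs = np.asarray(var_coeffs, dtype=float)
    mean_powers = float(env) ** np.arange(mean_coeffs.shape[0])
    var_powers = float(env) ** np.arange(var_coeffs.shape[0])
    mean = mean_powers @ mean_coeffs
    # Assumption: clamp to a floor so the variance stays positive.
    var = np.maximum(var_powers @ var_coeffs, var_floor)
    return mean, var


def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of one frame x under a diagonal-covariance mixture."""
    x = np.asarray(x, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    log_comp = -0.5 * (
        np.log(2.0 * np.pi * variances) + (x - means) ** 2 / variances
    ).sum(axis=1)
    a = np.log(np.asarray(weights, dtype=float)) + log_comp
    m = a.max()
    # Log-sum-exp over mixture components for numerical stability.
    return m + np.log(np.exp(a - m).sum())
```

In use, one would call `instantiate_gaussian` once per mixture component for the environment value observed at recognition time, then score frames with `gmm_log_likelihood` as in a conventional GMHMM.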

Tree‐based clustering is an effective method for sharing the states of an HMM, in which clustering is applied to a set of context‐dependent models with the phoneme context as the splitting condition. In previous work, the method has been restricted to the single Gaussian HMM. The single Gaussian HMM, however, is insufficient for representing the acoustic features, and an adequate topology (sharing of HMM states) will not necessarily be realized. Furthermore, in order to arrive at a state‐sharing model with the desired number of mixtures, the process of doubling the number of mixtures and the embedded training must be iterated after the tree‐based clustering, which increases the training time. Consequently, this paper proposes a method in which the tree‐based clustering algorithm for the single Gaussian HMM is extended to the clustering of the Gaussian mixture HMM. The proposed method reduces the training time to approximately one‐third that of the conventional method based on the single Gaussian HMM. A recognition experiment using a phone typewriter and a continuous word recognition experiment demonstrate that the recognition rate is improved by one to two points. © 2002 Wiley Periodicals, Inc. Syst Comp Jpn, 33(4): 40–49, 2002; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/scj.1118

Articles published on Mixture HMMs

Speaker and Channel Factors in Text-Dependent Speaker Recognition

Facial expression recognition based on combined HMM

A Study of Variable-Parameter Gaussian Mixture Hidden Markov Modeling for Noisy Speech Recognition

Estimating high‐confidence portions based on agreement among outputs of multiple LVCSR models

Tree‐based clustering for gaussian mixture HMMs

Phonemic hidden Markov models with continuous mixture output densities for large vocabulary word recognition
