Increasing the mixture components of non-uniform HMM structures based on a variational Bayesian approach

Takatoshi Jitsuhiro, Satoshi Nakamura

doi:10.21437/interspeech.2004-274

Abstract

We propose using the Variational Bayesian (VB) approach to automatically create non-uniform, context-dependent HMM topologies. The Maximum Likelihood (ML) criterion is generally used to create HMM topologies, but it is prone to overfitting. To avoid this problem, the VB approach has recently been applied to building acoustic models for speech recognition. We introduce the VB approach into the Successive State Splitting (SSS) algorithm, which can create both contextual and temporal variations for HMMs. Experimental results show that the proposed method automatically creates a more efficient model than the original method. Furthermore, we evaluate a method that uses the VB approach to increase the number of mixture components while taking temporal structures into account. The VB approach achieved the best performance while using fewer mixture components than the ML-based methods.
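
As a brief illustration of why a VB criterion can resist the overfitting noted above, the standard variational lower bound on the log marginal likelihood is sketched below. The notation is assumed here and is not taken from the paper: O denotes the observations, Z the latent state and mixture assignments, θ the model parameters, m a candidate model structure, and q a variational posterior that is assumed to factorize as q(Z)q(θ).

```latex
\log p(O \mid m) \;\ge\; \mathcal{F}_m[q]
  = \mathbb{E}_{q(Z)\,q(\theta)}\!\left[ \log \frac{p(O, Z, \theta \mid m)}{q(Z)\, q(\theta)} \right]
  = \mathbb{E}_{q(Z)\,q(\theta)}\!\left[ \log p(O, Z \mid \theta, m) \right]
    + H\!\left[ q(Z) \right]
    - \mathrm{KL}\!\left( q(\theta) \,\|\, p(\theta \mid m) \right)
```

Under these assumptions, the KL term typically grows as states or mixture components are added, so a larger structure is preferred only when the gain in the data term outweighs that penalty. The ML criterion has no comparable penalty, since the training likelihood does not decrease when the model is enlarged.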
