Improved extended HMM composition by incorporating power variance

Y. Minami, S. Furui

doi:10.1109/icslp.1996.607800

Abstract

The paper describes an improvement to extended HMM composition that allows HMMs to be adapted precisely to speech that is both noisy and distorted. The authors incorporate the variance of power into extended HMM composition, using quantization to approximate the Gaussian distribution of the 0th-order cepstrum. As a result, the distribution of noisy speech is approximated in the linear spectral domain as a mixture of lognormal distributions. The method is evaluated in a four-digit recognition experiment in which the number of digits is known. Two types of noise, computer room noise and car noise, are added to speech recorded with a boundary microphone to produce the noisy and distorted speech data. Results show that the proposed method improves recognition rates for noisy and distorted speech compared with the authors' previous method.
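As a rough illustration of the quantization idea summarized in the abstract, the following Python sketch replaces a Gaussian power term with a small set of weighted points, so that a log-domain Gaussian with power variance becomes a mixture of lognormal distributions in the linear domain. This is not the authors' implementation. The function names, the even grid spanning three standard deviations, the nine quantization points, the numeric values, and the simplification of treating power as an additive Gaussian offset on a single log-spectral component are all assumptions made for illustration, and the noise-composition step is left out.

```python
import math


def quantize_gaussian(mean, var, num_points=9, span=3.0):
    """Approximate N(mean, var) by weighted points on an even grid.

    Hypothetical helper: the grid covers +/- span standard deviations and
    each point is weighted by the Gaussian density, then normalized.
    """
    std = math.sqrt(var)
    step = 2.0 * span / (num_points - 1)
    zs = [-span + i * step for i in range(num_points)]
    raw = [math.exp(-0.5 * z * z) for z in zs]
    total = sum(raw)
    return [(r / total, mean + std * z) for r, z in zip(raw, zs)]


def lognormal_mean(mu, var):
    """Mean of exp(X) when X ~ N(mu, var)."""
    return math.exp(mu + 0.5 * var)


def mixture_of_lognormals(mu_s, var_s, var_p, num_points=9):
    """Return (weight, log_mean, log_var) components.

    Assumption: the log-spectral value is s + p, with s ~ N(mu_s, var_s)
    and an independent power offset p ~ N(0, var_p). Quantizing p shifts
    the log-domain mean of each component and leaves its variance at var_s.
    """
    points = quantize_gaussian(0.0, var_p, num_points)
    return [(w, mu_s + p, var_s) for w, p in points]


def mixture_mean(components):
    """Linear-domain mean of the lognormal mixture."""
    return sum(w * lognormal_mean(mu, var) for w, mu, var in components)


# Illustrative values only; they are not taken from the paper.
components = mixture_of_lognormals(mu_s=1.0, var_s=0.1, var_p=0.25)
approx = mixture_mean(components)
exact = lognormal_mean(1.0, 0.1 + 0.25)
print(f"mixture mean: {approx:.4f}, single lognormal mean: {exact:.4f}")
```

With these values the mixture mean should track the mean of the single lognormal that absorbs the power variance directly, which is a quick sanity check that the quantization grid is fine enough.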
