Abstract

This paper addresses robust speech recognition in noisy conditions within the framework of hidden Markov models (HMMs) and missing-feature imputation techniques. It presents a new statistical approach to detecting and estimating unreliable features, based on a probabilistic reliability measure and a Gaussian mixture model (GMM) representing the clean speech distribution. In the estimation process, the GMM is compensated using the parameters of a statistical model of the additive background noise, and the GMM means are used to estimate the clean speech features. The GMM-imputed values and the noisy signal are then combined in proportion to the probabilistic reliability measure to estimate the clean speech. The reliability measure avoids the need for a hard decision threshold to split the data into reliable and unreliable features, and consequently reduces the risk of misclassification. The GMM-based technique is less complex than the corresponding HMM-based estimation and gives a similar improvement in recognition performance. Once unreliable features have been replaced by the estimated clean speech features, the entire set of spectral features is transformed to the Mel-frequency cepstral coefficient (MFCC) domain. The MFCCs, which are characterized by a higher baseline recognition rate, are used for final recognition with continuous-density hidden Markov models (CDHMMs) with diagonal covariance matrices.
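
As an illustration of the soft combination described in the abstract, the following minimal sketch blends a noisy feature vector with a GMM-based estimate, weighted per feature by a reliability value. It is not the paper's implementation. The linear blend, the function name soft_impute, the use of per-frame component posteriors to weight the GMM means, and the array shapes are all assumptions made for illustration; the noise compensation of the GMM and the computation of the reliability measure are not shown.

```python
import numpy as np


def soft_impute(noisy, gmm_means, posteriors, reliability):
    """Blend noisy spectral features with a GMM-based clean-speech estimate.

    All names and shapes here are hypothetical, chosen for illustration:
      noisy:       (D,)   noisy spectral feature vector for one frame
      gmm_means:   (K, D) means of the clean-speech GMM components
      posteriors:  (K,)   component weights for this frame, summing to 1
                          (assumed to be computed elsewhere)
      reliability: (D,)   per-feature reliability in [0, 1];
                          1 means the noisy value is fully trusted
    """
    noisy = np.asarray(noisy, dtype=float)
    gmm_means = np.asarray(gmm_means, dtype=float)
    posteriors = np.asarray(posteriors, dtype=float)
    # Keep the reliability inside [0, 1] so the blend stays a convex combination.
    reliability = np.clip(np.asarray(reliability, dtype=float), 0.0, 1.0)

    # GMM-imputed value: posterior-weighted average of the component means.
    imputed = posteriors @ gmm_means

    # Assumed linear blend: reliable features keep the observation,
    # unreliable features fall back to the imputed value.
    return reliability * noisy + (1.0 - reliability) * imputed


if __name__ == "__main__":
    means = [[0.0, 0.0], [2.0, 2.0]]
    estimate = soft_impute(
        noisy=[1.0, 5.0],
        gmm_means=means,
        posteriors=[0.5, 0.5],
        reliability=[1.0, 0.0],
    )
    print(estimate)
```

Because the reliability is continuous, no feature is forced into a strictly "reliable" or "unreliable" class, which is the property the abstract credits with reducing the risk of misclassification.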
