Abstract
This paper addresses robust speech recognition in noisy conditions within the framework of hidden Markov models (HMMs) and missing-feature imputation techniques. It presents a new statistical approach to detecting and estimating unreliable features, based on a probabilistic reliability measure and a Gaussian mixture model (GMM) representing the clean speech distribution. In the estimation process, the GMM is compensated using the parameters of a statistical model of the additive background noise, and the GMM means are used to estimate the clean speech features. The GMM-imputed values and the noisy signal are then combined in proportion to the probabilistic reliability measure to produce the final clean speech estimate. The reliability measure avoids the need for a hard decision threshold to split the data into reliable and unreliable features, and consequently reduces the risk of misclassification. The GMM-based technique is less complex than the corresponding HMM-based estimation and gives a similar improvement in recognition performance. Once unreliable features have been replaced by the estimated clean speech features, the entire set of spectral features is transformed to the MFCC (Mel Frequency Cepstral Coefficient) domain. The MFCCs, which are characterized by a higher baseline recognition rate, are used for final recognition with continuous density hidden Markov models (CDHMMs) with diagonal covariance matrices.
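The soft combination described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a diagonal-covariance GMM, takes the per-feature reliability values (in [0, 1]) as given rather than computing them, omits the noise compensation of the GMM, and uses the posterior-weighted GMM means as the imputed value. All function and parameter names (`gmm_posteriors`, `soft_impute`, `reliability`) are hypothetical.

```python
import math


def gmm_posteriors(y, weights, means, variances):
    """Posterior P(k | y) of each component of a diagonal-covariance GMM."""
    log_terms = []
    for w, mu, var in zip(weights, means, variances):
        ll = math.log(w)
        for yi, mi, vi in zip(y, mu, var):
            ll += -0.5 * (math.log(2 * math.pi * vi) + (yi - mi) ** 2 / vi)
        log_terms.append(ll)
    # Normalize in the log domain for numerical stability.
    m = max(log_terms)
    exps = [math.exp(t - m) for t in log_terms]
    total = sum(exps)
    return [e / total for e in exps]


def soft_impute(y, reliability, weights, means, variances):
    """Blend noisy features with GMM-imputed values, per feature.

    reliability[d] = 1 keeps the noisy observation y[d] unchanged;
    reliability[d] = 0 replaces it with the GMM-based estimate.
    (Assumed linear blending; the paper's exact rule may differ.)
    """
    post = gmm_posteriors(y, weights, means, variances)
    imputed = [
        sum(p * mu[d] for p, mu in zip(post, means)) for d in range(len(y))
    ]
    return [r * yd + (1.0 - r) * xd for r, yd, xd in zip(reliability, y, imputed)]
```

Because the reliability enters as a continuous weight, no feature is ever forced into a binary reliable/unreliable class; a hard-threshold scheme corresponds to the special case where every reliability value is exactly 0 or 1.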