Variational noise model composition through model perturbation for robust speech recognition with time-varying background noise

Wooil Kim, John H.L. Hansen

doi:10.1016/j.specom.2010.12.001

Abstract

This study proposes a novel model composition method to improve speech recognition performance in time-varying background noise conditions. The method is motivated by the observation that each cepstral coefficient represents the degree of variation across frequency of a component of the log-spectral envelope. Accordingly, variational noise models are formulated by selectively applying perturbation factors to the mean parameters of a basis model, resulting in a collection of noise models that more accurately reflect the natural range of spectral patterns seen in the log-spectral domain. The basis noise model is obtained from the silence segments of the input speech. The perturbation factors are designed separately for changes in the energy level and in the spectral envelope. The proposed variational model composition (VMC) method is employed to generate multiple environmental models for our previously proposed parallel combined Gaussian mixture model (PCGMM) based feature compensation algorithm. A mixture sharing technique is integrated to reduce the computational cost incurred by employing the variational models. Experimental results show that the proposed method is considerably more effective at increasing speech recognition performance in time-varying background noise conditions, with +31.31%, +10.65%, and +20.54% average relative improvements in word error rate for speech babble, background music, and real-life in-vehicle noise conditions, respectively, compared to the original basic PCGMM method.
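The composition step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the perturbation formulas, so the additive shift on c0 for energy level, the multiplicative scale on one low-order coefficient for spectral envelope, and the names `estimate_basis_mean`, `compose_variational_means`, and `envelope_index` are all assumptions made for illustration. Only the mean parameters are handled, and the toy vectors are invented.

```python
from itertools import product


def estimate_basis_mean(silence_frames):
    """Average the cepstral vectors of silence frames into one basis mean."""
    n = len(silence_frames)
    dim = len(silence_frames[0])
    return [sum(frame[k] for frame in silence_frames) / n for k in range(dim)]


def compose_variational_means(basis_mean, energy_offsets, envelope_scales,
                              envelope_index=1):
    """Return one perturbed mean vector per (energy, envelope) factor pair.

    Assumption (not from the paper): c0 at index 0 tracks overall energy and
    is shifted additively; the coefficient at envelope_index is scaled to
    alter the spectral envelope.
    """
    variants = []
    for offset, scale in product(energy_offsets, envelope_scales):
        mean = list(basis_mean)          # copy so the basis stays unchanged
        mean[0] += offset                # energy-level perturbation
        mean[envelope_index] *= scale    # spectral-envelope perturbation
        variants.append(mean)
    return variants


# Toy 3-dimensional cepstral vectors standing in for silence frames.
silence = [[10.0, 2.0, -1.0], [12.0, 4.0, -3.0]]
basis = estimate_basis_mean(silence)
models = compose_variational_means(basis, [-1.0, 0.0, 1.0], [0.5, 1.0, 1.5])
```

With three energy offsets and three envelope scales, the sketch yields nine candidate noise means, one of which (offset 0, scale 1) is the unperturbed basis. This growth in the number of models is the kind of cost that the mixture sharing technique mentioned in the abstract is meant to reduce.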


Journal: Speech Communication	Publication Date: Dec 28, 2010
Citations: 20

Similar Papers

Feature Compensation Employing Variational Model Composition for Robust Speech Recognition in In-Vehicle Environment
Wooil Kim ... John H. L. Hansen
05 Nov 2011

Mask estimation employing Posterior-based Representative Mean for missing-feature speech recognition with time-varying background noise
Wooil Kim ... John H.L Hansen
01 Dec 2009

Missing-Feature Reconstruction by Leveraging Temporal Spectral Correlation for Robust Speech Recognition in Background Noise Conditions
Wooil Kim ... J Hansen
IEEE Transactions on Audio, Speech, and Language Processing | VOL. 18
01 Nov 2010

A Novel Mask Estimation Method Employing Posterior-Based Representative Mean Estimate for Missing-Feature Speech Recognition
Wooil Kim ... John H L Hansen
IEEE Transactions on Audio, Speech, and Language Processing | VOL. 19
01 Jul 2011
