Modulation Spectrum Equalization for Improved Robust Speech Recognition

Liang-Che Sun, Lin-Shan Lee

doi:10.1109/tasl.2011.2166544

Abstract

We propose novel approaches for equalizing the modulation spectrum for robust feature extraction in speech recognition. Common to all approaches is that the temporal trajectories of the feature parameters are first transformed into the magnitude modulation spectrum. In spectral histogram equalization (SHE) and two-band spectral histogram equalization (2B-SHE), we equalize the histogram of the modulation spectrum for each utterance to a reference histogram obtained from clean training data, or perform the equalization with two sub-bands on the modulation spectrum. In magnitude ratio equalization (MRE), we define the magnitude ratio of lower to higher modulation frequency components for each utterance, and equalize this to a reference value obtained from clean training data. These approaches can be viewed as temporal filters that are adapted to each testing utterance. Experiments performed on the Aurora 2 and 4 corpora for small and large vocabulary tasks indicate that significant performance improvements are achievable for all noise conditions. We also show that additional improvements can be obtained when these approaches are integrated with cepstral mean and variance normalization (CMVN), histogram equalization (HEQ), higher order cepstral moment normalization (HOCMN), or the advanced front-end (AFE). We analyze and discuss the reasons for these improvements from different viewpoints with different sets of data, including adaptive temporal filtering, noise behavior on the modulation spectrum, phoneme types, and modulation spectrum distance measures.
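The abstract describes the processing chain only in words, so the following Python sketch is one possible reading of it, not the authors' implementation. It assumes that the modulation spectrum is the DFT of a single feature trajectory, that the phase is kept unchanged, that SHE can be approximated by rank-based mapping onto sorted reference magnitudes, and that MRE rescales only the upper band. All function names, the cutoff bin, and the reference values are hypothetical.

```python
import numpy as np

def magnitude_modulation_spectrum(trajectory):
    """DFT of one feature trajectory over time; returns (magnitude, phase)."""
    spectrum = np.fft.rfft(trajectory)
    return np.abs(spectrum), np.angle(spectrum)

def equalize_histogram(magnitudes, reference_sorted):
    """SHE-style mapping (assumed form): send each magnitude to the
    reference value that has the same empirical CDF position."""
    n = len(magnitudes)
    ranks = np.argsort(np.argsort(magnitudes))       # rank 0 .. n-1 of each bin
    cdf = (ranks + 0.5) / n                          # empirical CDF in (0, 1)
    m = len(reference_sorted)
    ref_cdf = (np.arange(m) + 0.5) / m
    return np.interp(cdf, ref_cdf, reference_sorted)

def equalize_magnitude_ratio(magnitudes, cutoff_bin, reference_ratio):
    """MRE-style scaling (assumed form): rescale the upper band so that
    sum(low band) / sum(high band) equals reference_ratio."""
    low = magnitudes[:cutoff_bin].sum()
    high = magnitudes[cutoff_bin:].sum()
    out = magnitudes.copy()
    if high > 0 and reference_ratio > 0:
        out[cutoff_bin:] *= (low / high) / reference_ratio
    return out

def rebuild_trajectory(magnitudes, phase, length):
    """Combine equalized magnitudes with the original phase and invert."""
    return np.fft.irfft(magnitudes * np.exp(1j * phase), n=length)

# Toy usage with synthetic data (stand-ins for real feature trajectories).
rng = np.random.default_rng(0)
noisy = rng.normal(size=200)                         # one feature dimension
clean_ref = np.sort(np.abs(np.fft.rfft(rng.normal(size=200))))
mag, phase = magnitude_modulation_spectrum(noisy)
she_out = rebuild_trajectory(equalize_histogram(mag, clean_ref), phase, len(noisy))
mre_out = rebuild_trajectory(equalize_magnitude_ratio(mag, 10, 3.0), phase, len(noisy))
```

In a real system the reference magnitudes and the reference ratio would be estimated from clean training data, as the abstract states. Because each utterance gets its own magnitude mapping, the operation behaves like a temporal filter adapted to that utterance.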

Journal: IEEE Transactions on Audio, Speech, and Language Processing
Publication Date: Mar 1, 2012
Citations: 19
