Cepstral noise subtraction for robust automatic speech recognition

Robert Rehr, Timo Gerkmann

doi:10.1109/icassp.2015.7177994

Abstract

The robustness of speech recognizers to noise can be increased by normalizing the statistical moments of the Mel-frequency cepstral coefficients (MFCCs), e.g., by using cepstral mean normalization (CMN) or cepstral mean and variance normalization (CMVN). The necessary statistics are estimated over a long time window, often a complete utterance. Consequently, changes in the background noise can be tracked only to a limited extent, which restricts the performance gain these techniques can achieve. In contrast, recently developed single-channel speech enhancement algorithms can track the background noise quickly. In this paper, we aim to combine speech enhancement techniques with feature normalization methods. To this end, we propose to transform an estimate of the noise power spectral density to the MFCC domain, where we subtract it from the noisy MFCCs. This is followed by a conventional CMVN. For background noises that are too nonstationary for CMVN but can be tracked by the noise estimator, we show that this processing improves on applying CMVN alone. The performance gain is most pronounced at low signal-to-noise ratios.
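The processing chain described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names (`mel_filterbank`, `dct_matrix`, `power_to_mfcc`, `cmvn`, `cepstral_noise_subtraction`), the triangular filterbank, the orthonormal DCT-II, the spectral floor, and utterance-level CMVN are all choices made here for illustration. The noise power spectral density is assumed to be supplied by some external noise tracker, which is not shown.

```python
import numpy as np

def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular mel filterbank, shape (n_mels, n_fft // 2 + 1). Assumed design."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    fb = np.zeros((n_mels, freqs.size))
    for i in range(n_mels):
        left, center, right = hz_points[i:i + 3]
        rising = (freqs - left) / (center - left)
        falling = (right - freqs) / (right - center)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def dct_matrix(n_ceps, n_mels):
    """Orthonormal DCT-II matrix, shape (n_ceps, n_mels)."""
    k = np.arange(n_ceps)[:, None]
    n = np.arange(n_mels)[None, :]
    d = np.sqrt(2.0 / n_mels) * np.cos(np.pi * k * (n + 0.5) / n_mels)
    d[0] /= np.sqrt(2.0)
    return d

def power_to_mfcc(power_spec, fb, dct, floor=1e-10):
    """Map power spectra (frames, bins) to MFCCs (frames, n_ceps)."""
    mel_energy = np.maximum(power_spec @ fb.T, floor)  # floor avoids log(0)
    return np.log(mel_energy) @ dct.T

def cmvn(features, eps=1e-8):
    """Utterance-level cepstral mean and variance normalization."""
    mean = features.mean(axis=0, keepdims=True)
    std = features.std(axis=0, keepdims=True)
    return (features - mean) / (std + eps)

def cepstral_noise_subtraction(noisy_power, noise_psd, fb, dct):
    """Subtract the noise estimate in the MFCC domain, then apply CMVN."""
    noisy_mfcc = power_to_mfcc(noisy_power, fb, dct)
    noise_mfcc = power_to_mfcc(noise_psd, fb, dct)  # noise PSD mapped to MFCC domain
    return cmvn(noisy_mfcc - noise_mfcc)

if __name__ == "__main__":
    # Synthetic stand-ins for a noisy periodogram and a tracked noise PSD.
    rng = np.random.default_rng(0)
    fb = mel_filterbank(n_mels=26, n_fft=512, sample_rate=16000)
    dct = dct_matrix(n_ceps=13, n_mels=26)
    noisy_power = rng.uniform(0.5, 2.0, size=(50, 257))
    noise_psd = rng.uniform(0.1, 0.4, size=(50, 257))
    features = cepstral_noise_subtraction(noisy_power, noise_psd, fb, dct)
    print(features.shape)
```

Because the noise estimate is frame-wise, the subtraction can follow noise changes within an utterance, whereas the CMVN statistics in this sketch are still computed once over all frames.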

Journal: Control theory & applications
Publication Date: Apr 1, 2015
Citations: 17

Similar Papers

Improved cepstral mean and variance normalization using Bayesian framework
N Vishnu Prasad ... S Umesh
01 Dec 2013

Cepstral feature normalization method using pole filtering and scale normalization for robust speech recognition
Bo Kyeong Choi ... Sung Min Ban
The Journal of the Acoustical Society of Korea | VOL. 34
31 Jul 2015

Optimizing acoustic features for source cell-phone recognition using speech signals
Cemal Hanilçi ... Figen Ertas
17 Jun 2013

Robust Front-End Based on MVA and HEQ Post-processing for Arabic Speech Recognition Using Hidden Markov Model Toolkit (HTK)
Elhem Techini ... Medsalim Bouhlel
01 Oct 2017
