Intra-frame cepstral sub-band weighting and histogram equalization for noise-robust speech recognition

Jeih-Weih Hung, Hao-Teng Fan

doi:10.1186/1687-4722-2013-29


Open Access


Abstract

In this paper, we propose a novel noise-robustness method, weighted sub-band histogram equalization (WS-HEQ), to improve speech recognition accuracy in noise-corrupted environments. Based on the observation that the high- and low-pass portions of the intra-frame cepstral features are of unequal importance for noise-corrupted speech recognition, WS-HEQ is designed to reduce the high-pass components of the cepstral features. Furthermore, we provide four types of WS-HEQ, which partially draw on the structure of spatial histogram equalization (S-HEQ). In experiments conducted on the Aurora-2 noisy-digit database, WS-HEQ yields significant recognition improvements relative to the Mel-scaled filter-bank cepstral coefficient (MFCC) baseline and to cepstral histogram normalization (CHN) in various noise-corrupted situations, and it outperforms S-HEQ in most cases.
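The per-channel histogram equalization that CHN applies can be illustrated with a minimal sketch. It assumes a rank-based empirical CDF and a standard normal reference distribution. Both are illustrative choices, not details confirmed by the abstract, and the function name `equalize_channel` is hypothetical.

```python
from statistics import NormalDist


def equalize_channel(values):
    """Map one cepstral channel (a sequence over time) onto a standard normal.

    Assumption: the reference distribution is N(0, 1) and the empirical CDF
    is estimated from ranks within the utterance.
    """
    n = len(values)
    # Indices of the samples sorted by value (ties keep their original order).
    order = sorted(range(n), key=lambda i: values[i])
    reference = NormalDist(0.0, 1.0)
    equalized = [0.0] * n
    for rank, idx in enumerate(order):
        # Mid-rank CDF estimate stays strictly inside (0, 1).
        cdf = (rank + 0.5) / n
        equalized[idx] = reference.inv_cdf(cdf)
    return equalized


if __name__ == "__main__":
    channel = [3.0, 1.0, 2.0, 10.0, -4.0]
    print(equalize_channel(channel))
```

The mapping is monotonic, so the ordering of the samples in the channel is preserved while their histogram is forced to match the reference.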

Highlights

The performance of speech recognition systems is often degraded due to noise in application environments.
The work in [11] revealed that in the cepstral histogram normalization (CHN) method, even though each cepstral channel is processed by histogram equalization (HEQ), a significant histogram mismatch still exists between the training and testing cepstral features for the low-pass filtered (LPF) and high-pass filtered (HPF) portions of the intra-frame cepstra.
The presented weighted sub-band histogram equalization (WS-HEQ) is evaluated in terms of recognition accuracy (Section 5.1, Recognition accuracy).

Summary

Introduction

The performance of speech recognition systems is often degraded due to noise in application environments. Typical examples of noise-robustness techniques developed to address this problem are perceptual masking [1], empirical mode decomposition [2], optimally modified log-spectral amplitude estimation [3], wavelet packet decomposition with AR modeling [4], cepstral mean and variance normalization (MVN) [5], cepstral histogram normalization (CHN) [6,7], MVN with ARMA filtering (MVA) [8], higher order cepstral moment normalization (HOCMN) [9], and temporal structure normalization (TSN) [10]. In some of these methods, the compensation is performed on each individual cepstral channel sequence of an utterance, under the assumption that these channels are mostly uncorrelated [7]. We change the order of the procedures in S-HEQ: we first split the original intra-frame cepstra (not the CHN-preprocessed cepstra) into LPF and HPF portions, then compensate the LPF and HPF portions individually, and finally normalize the full-band cepstra. Compared with S-HEQ, this new structure reduces the effect of noise on the LPF and HPF portions of the plain cepstra more directly.
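The reordered pipeline described above (split, compensate each sub-band, then normalize the full band) can be sketched as follows. This is only an illustration under stated assumptions: the two-tap average/difference filters, the standard normal reference, the weight `alpha`, and all function names are hypothetical choices, and the sketch does not claim to reproduce any of the paper's four WS-HEQ variants.

```python
from statistics import NormalDist


def heq(values):
    """Rank-based histogram equalization to N(0, 1) (assumed reference)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ref = NormalDist(0.0, 1.0)
    out = [0.0] * n
    for rank, idx in enumerate(order):
        out[idx] = ref.inv_cdf((rank + 0.5) / n)
    return out


def split_frame(cepstra):
    """Split one frame's cepstral vector into LPF and HPF parts.

    Assumption: two-tap filters across adjacent coefficients, chosen so that
    low[k] + high[k] reproduces the original coefficient exactly.
    """
    low, high, prev = [], [], 0.0
    for c in cepstra:
        low.append((c + prev) / 2.0)
        high.append((c - prev) / 2.0)
        prev = c
    return low, high


def equalize_over_time(frames):
    """Apply heq to each coefficient index along the time axis."""
    n_frames, n_coef = len(frames), len(frames[0])
    cols = [heq([frames[t][k] for t in range(n_frames)]) for k in range(n_coef)]
    return [[cols[k][t] for k in range(n_coef)] for t in range(n_frames)]


def weighted_subband_heq(frames, alpha=0.5):
    """Split, equalize each sub-band, down-weight HPF, recombine, normalize."""
    parts = [split_frame(f) for f in frames]
    low_eq = equalize_over_time([p[0] for p in parts])
    high_eq = equalize_over_time([p[1] for p in parts])
    # alpha < 1 reduces the high-pass portion (hypothetical weight value).
    combined = [
        [lo + alpha * hi for lo, hi in zip(low_t, high_t)]
        for low_t, high_t in zip(low_eq, high_eq)
    ]
    return equalize_over_time(combined)


if __name__ == "__main__":
    utterance = [[1.0, 0.2, -0.3], [0.5, 0.9, 0.1], [2.0, -1.0, 0.4], [0.0, 0.3, 0.8]]
    for frame in weighted_subband_heq(utterance):
        print(frame)
```

Setting `alpha` to 1 treats both sub-bands equally, while smaller values attenuate the high-pass portion before the final full-band normalization.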

Proposed approach

Experimental setup

Method

Experimental results and discussions for the Aurora-2 task

The experiment on the TCC-300 Mandarin dataset

Conclusions


Journal: EURASIP Journal on Audio, Speech, and Music Processing
Publication Date: Dec 1, 2013
Citations: 9
License type: CC BY 2.0
