An energy-constrained signal subspace method for speech enhancement and recognition in white and colored noises

Jun Huang, Yunxin Zhao

doi:10.1016/s0167-6393(98)00041-7

Abstract

In this paper, an energy-constrained signal subspace (ECSS) method is proposed for speech enhancement and automatic speech recognition under additive noise conditions. The key idea is to match the short-time energy of the enhanced speech signal to the unbiased estimate of the short-time energy of the clean speech, which proves very effective for improving the estimation of the noise-like, low-energy segments in continuous speech. The ECSS method is applied to both white and colored noise, where the additive colored noise is modelled by an autoregressive (AR) process. A modified covariance method is used to estimate the AR parameters of the colored noise, and a prewhitening filter is constructed from the estimated parameters. The performance of the proposed algorithms was evaluated using the TI46 digit database and the TIMIT continuous speech database. It was found that the ECSS method can achieve very high word recognition accuracy (WRA) for the digit set under low SNR conditions. For the continuous speech data set, the method improved the SNR by 2–6 dB and the WRA by 13.7–45.5% for white noise and by 18.6–55.9% for colored noise under various SNR conditions.
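
The energy-matching idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation; the abstract does not give the exact form of the constraint. The sketch assumes zero-mean white noise of known variance that is uncorrelated with the speech, so that the noisy frame energy minus N times the noise variance is an unbiased estimate of the clean frame energy. The function names, the `floor` parameter, and the single per-frame gain are assumptions made for illustration.

```python
import math


def short_time_energy(frame):
    """Sum of squared samples in one frame."""
    return sum(x * x for x in frame)


def energy_constrained_rescale(enhanced, noisy, noise_var, floor=0.0):
    """Scale an enhanced frame so its energy matches an estimate of the
    clean-speech energy (illustrative only, not the paper's exact rule).

    Assumption: the noise is zero-mean, white, has variance noise_var,
    and is uncorrelated with the speech, so
        E[sum(y^2)] = sum(s^2) + N * noise_var.
    """
    n = len(noisy)
    # Unbiased clean-energy estimate, floored so it never goes negative.
    target = max(short_time_energy(noisy) - n * noise_var, floor)
    current = short_time_energy(enhanced)
    if current == 0.0:
        # An all-zero frame cannot be rescaled; return it unchanged.
        return list(enhanced)
    gain = math.sqrt(target / current)
    return [gain * x for x in enhanced]


# Toy frames (made-up numbers, not data from the paper).
noisy_frame = [1.0, -2.0, 2.0, -1.0]      # energy 10
enhanced_frame = [0.5, -1.0, 1.0, -0.5]   # energy 2.5
matched = energy_constrained_rescale(enhanced_frame, noisy_frame, noise_var=0.5)
```

With these toy values the target energy is 10 - 4 × 0.5 = 8, so the rescaled frame has energy 8. For a low-energy frame where the noise estimate exceeds the observed energy, the floor keeps the target at zero instead of letting it go negative.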

Full Text