Abstract

In this paper, an energy-constrained signal subspace (ECSS) method is proposed for speech enhancement and automatic speech recognition under additive noise conditions. The key idea is to match the short-time energy of the enhanced speech signal to the unbiased estimate of the short-time energy of the clean speech, which proved very effective for improving the estimation of the noise-like, low-energy segments in continuous speech. The ECSS method is applied to both white and colored noise, where the additive colored noise is modelled by an autoregressive (AR) process. A modified covariance method is used to estimate the AR parameters of the colored noise, and a prewhitening filter is constructed from the estimated parameters. The performance of the proposed algorithms was evaluated using the TI46 digit database and the TIMIT continuous speech database. The ECSS method was found to achieve very high word recognition accuracy (WRA) for the digit set under low-SNR conditions. For the continuous speech data set, the method improved the SNR by 2–6 dB and the WRA by 13.7–45.5% for white noise and by 18.6–55.9% for colored noise under various SNR conditions.
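
The energy-matching idea can be illustrated with a minimal sketch. It is not the ECSS algorithm itself, which operates in the signal subspace. It only shows the constraint under simplifying assumptions: the noise is zero-mean, white, uncorrelated with the speech, and has a known variance, so that the noisy frame energy minus N times the noise variance is an unbiased estimate of the clean frame energy. The function names `clean_energy_estimate` and `match_energy`, and the use of a single scalar gain per frame, are hypothetical choices made for illustration.

```python
import math
from typing import List


def clean_energy_estimate(noisy_frame: List[float], noise_variance: float) -> float:
    """Estimate the clean short-time energy of one frame.

    Assumes zero-mean noise uncorrelated with the speech and a known
    noise variance (an assumption of this sketch). The difference is
    floored at zero, so the result is only approximately unbiased for
    very low-energy frames.
    """
    noisy_energy = sum(s * s for s in noisy_frame)
    return max(noisy_energy - len(noisy_frame) * noise_variance, 0.0)


def match_energy(enhanced_frame: List[float], target_energy: float) -> List[float]:
    """Rescale an enhanced frame so that its energy equals target_energy."""
    current_energy = sum(s * s for s in enhanced_frame)
    if current_energy == 0.0:
        # Nothing to rescale; return a copy of the all-zero frame.
        return list(enhanced_frame)
    gain = math.sqrt(target_energy / current_energy)
    return [gain * s for s in enhanced_frame]


# Hypothetical frame values, chosen only to make the arithmetic easy to follow.
noisy = [1.0, -1.0, 2.0, 0.0]
target = clean_energy_estimate(noisy, noise_variance=0.5)
rescaled = match_energy([2.0, 2.0, 2.0, 2.0], target)
```

Here the noisy frame has energy 6, and subtracting 4 × 0.5 leaves a target of 4, so the enhanced frame (energy 16) is scaled by a gain of 0.5.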
