Abstract

To improve the performance of communication systems and of other applications related to speech or speaker recognition, it is usual to incorporate a speech enhancement preprocessing stage before the signal enters the specific system. Although the signal subspace approach is very useful, it becomes less effective when the signal-to-noise ratio (SNR) is lower than 10 dB. This is due to possible inaccuracies in estimating the noise, whose effect on the different phonemes and frequency bands is far from uniform, even for white noise. A critical point is estimating the signal subspace dimension. It depends not only on the noise variance but also on the SNR, which varies across the temporal segments of speech as well as across frequency bands. It is therefore proposed to work on temporal frames in each critical band and to estimate the signal subspace dimension using as the reference the noise variance scaled by a factor that depends on the SNR of the particular segment being considered. Thus weak signal components that could otherwise be eliminated are preserved. The projection of the noisy signal onto the signal subspace can be regarded as a first stage in the speech enhancement process, which has to be complemented with additional techniques.
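
The dimension estimate described above can be illustrated with a minimal sketch. It assumes that the eigenvalues of the noisy-signal covariance matrix for one frame in one critical band are already available, and that the dimension is taken as the number of eigenvalues exceeding the scaled noise variance. The abstract does not give the form of the SNR-dependent factor, so the function snr_factor, its linear shape, and all default values are hypothetical choices made only for illustration.

```python
def snr_factor(snr_db, snr_lo=0.0, snr_hi=20.0, f_lo=1.0, f_hi=2.0):
    """Hypothetical SNR-dependent scaling of the noise variance.

    Assumption: the factor grows linearly from f_lo to f_hi as the segment
    SNR goes from snr_lo to snr_hi dB, and is clamped outside that range.
    The source does not specify this mapping.
    """
    t = (snr_db - snr_lo) / (snr_hi - snr_lo)
    t = min(1.0, max(0.0, t))  # clamp to [0, 1]
    return f_lo + t * (f_hi - f_lo)


def estimate_signal_dimension(eigenvalues, noise_variance, snr_db):
    """Count the eigenvalues above the SNR-scaled noise variance.

    eigenvalues: eigenvalues of the noisy covariance matrix for one frame
                 in one critical band (assumed to be precomputed).
    noise_variance: estimated noise variance for that band.
    snr_db: estimated SNR of the segment, in dB.
    """
    threshold = snr_factor(snr_db) * noise_variance
    return sum(1 for lam in eigenvalues if lam > threshold)


# Illustrative values only, not data from the paper.
example_eigenvalues = [5.0, 2.5, 1.4, 0.9]
dim_low_snr = estimate_signal_dimension(example_eigenvalues, 1.0, snr_db=0.0)
dim_high_snr = estimate_signal_dimension(example_eigenvalues, 1.0, snr_db=20.0)
```

With these assumed defaults, the low-SNR segment uses a lower threshold and so retains the weaker component at 1.4, which a fixed, larger threshold would discard.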

