Speech enhancement and noise reduction have wide applications in speech processing, where they are often employed as a pre-processing stage. This paper addresses the denoising of a single-channel speech signal in the presence of highly non-stationary background noise, with the aim of improving the perceived quality and intelligibility of the speech. Real-world noise is mostly highly non-stationary and does not affect the speech signal uniformly across the spectrum. The paper investigates several Discrete Fourier Transform-based algorithms as single-channel pre-processing techniques: Spectral Subtraction using over-subtraction and a spectral floor, Multi-Band Spectral Subtraction (MBSS), the Wiener filter, the MMSE Short-Time Spectral Amplitude (MMSE-STSA) estimator with and without the speech presence uncertainty (SPU) modifier, the MMSE Log-Spectral Amplitude estimator with and without the SPU modifier, and the Optimally-Modified Log-Spectral Amplitude (OM-LSA) estimator. The speech processed by these algorithms is compared under the same set of conditions using visual examination of the time-domain signals and spectrograms, as well as objective and subjective tests of quality and perceptual evaluation. All the implemented algorithms provide considerable, though differing, degrees of flexibility and control over the level of noise elimination, which reduces artifacts in the enhanced speech and results in improved quality and intelligibility.
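To illustrate the first technique listed, spectral subtraction with over-subtraction and a spectral floor, the following is a minimal sketch of the per-bin rule applied to one frame of power-spectrum values. It is not the implementation evaluated in the paper. The function name `spectral_subtract`, the parameter names `alpha` (over-subtraction factor) and `beta` (spectral floor factor), and their default values are assumptions chosen for illustration. The sketch also assumes that the noisy power spectrum and a noise power estimate are already available for each frequency bin.

```python
from typing import List, Sequence


def spectral_subtract(
    noisy_power: Sequence[float],
    noise_power: Sequence[float],
    alpha: float = 4.0,   # over-subtraction factor (illustrative default)
    beta: float = 0.01,   # spectral floor factor (illustrative default)
) -> List[float]:
    """Apply over-subtraction with a spectral floor to one frame.

    For each frequency bin, alpha times the noise power estimate is
    subtracted from the noisy power. If the result falls to or below
    beta times the noise power, the bin is set to that floor instead.
    """
    if len(noisy_power) != len(noise_power):
        raise ValueError("noisy_power and noise_power must have equal length")

    enhanced = []
    for y, d in zip(noisy_power, noise_power):
        subtracted = y - alpha * d   # over-subtract the noise estimate
        floor = beta * d             # minimum power allowed in this bin
        enhanced.append(subtracted if subtracted > floor else floor)
    return enhanced


# Hypothetical three-bin frame with a flat noise estimate.
frame = spectral_subtract([10.0, 2.0, 0.5], [1.0, 1.0, 1.0], alpha=2.0, beta=0.1)
print(frame)  # [8.0, 0.1, 0.1]
```

In this sketch a larger `alpha` removes more residual noise at the risk of distorting the speech, while a non-zero `beta` fills the spectral valleys that would otherwise be heard as isolated tonal artifacts.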