Joint Denoising and Dereverberation Using Exemplar-Based Sparse Representations and Decaying Norm Constraint

Deepak Baby, Hugo Van Hamme

doi:10.1109/taslp.2017.2744261

Abstract

Exemplar-based nonnegative models, in which noisy speech is decomposed as a sparse nonnegative linear combination of speech and noise exemplars stored in a dictionary, have been successfully used for speech denoising. This paper extends the technique to single-channel speech enhancement in noisy reverberant environments using a novel approximation of noisy reverberant speech in the frequency domain and nonnegative matrix deconvolution. In the proposed model, the room impulse response (RIR) in the magnitude short-time Fourier transform domain is defined such that its decaying structure can also be estimated from the test data itself, whereas existing models used a suboptimal binwise clamping procedure to impose a decaying structure that does not hold in a typical RIR. This paper presents multiplicative updates for estimating the RIR, its decay, and the underlying anechoic speech and noise. The proposed model is evaluated on a synthetic dataset created by convolving TIMIT recordings with RIRs measured in different rooms at varying speaker and microphone locations, and adding background noise taken from the CHiME corpus. Simulation results show that the proposed model yields a better RIR estimate than the existing model and improves various instrumental speech quality measures.
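
As an illustration of the exemplar-based denoising step that the abstract builds on, the following is a minimal sketch, not the authors' implementation. It assumes a fixed nonnegative dictionary whose first columns are speech exemplars and whose remaining columns are noise exemplars, a generalized Kullback-Leibler divergence cost with an L1 sparsity penalty, and the standard multiplicative update for that cost. The names `sparse_activations`, `masked_speech_estimate`, and `sparsity` are hypothetical, and the random matrices stand in for real magnitude spectrograms. The proposed reverberant model (nonnegative matrix deconvolution with RIR and decay estimation) is not covered by this sketch.

```python
import numpy as np


def sparse_activations(Y, A, sparsity=0.1, n_iter=200, eps=1e-12):
    """Estimate nonnegative activations X such that Y is approximately A @ X.

    Y: (F, T) nonnegative magnitude spectrogram of the noisy speech.
    A: (F, K) fixed nonnegative dictionary of speech and noise exemplars.
    Minimizes generalized KL divergence plus sparsity * sum(X)
    (an assumed cost) with multiplicative updates.
    """
    rng = np.random.default_rng(0)
    X = rng.random((A.shape[1], Y.shape[1])) + eps
    col_sums = A.sum(axis=0)[:, None]  # A^T 1, shape (K, 1)
    for _ in range(n_iter):
        ratio = Y / (A @ X + eps)
        # Multiplicative update keeps X nonnegative by construction.
        X *= (A.T @ ratio) / (col_sums + sparsity)
    return X


def masked_speech_estimate(Y, A, X, n_speech, eps=1e-12):
    """Apply a Wiener-like mask built from the speech and noise reconstructions."""
    speech = A[:, :n_speech] @ X[:n_speech]
    noise = A[:, n_speech:] @ X[n_speech:]
    return Y * speech / (speech + noise + eps)


# Toy usage with random stand-in data (not real spectrograms).
rng = np.random.default_rng(1)
A = rng.random((20, 8))                       # 5 "speech" + 3 "noise" atoms
X_true = rng.random((8, 5)) * (rng.random((8, 5)) < 0.3)
Y = A @ X_true
X = sparse_activations(Y, A)
S = masked_speech_estimate(Y, A, X, n_speech=5)
```

Because the mask lies between 0 and 1, the speech estimate `S` never exceeds the observed magnitude `Y` in any time-frequency bin.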

Full Text