Unsupervised speech enhancement in low SNR environments via sparseness and temporal gradient regularization

Nasir Saleem, Muhammad Irfan Khattak, Muhammad Shafi

doi:10.1016/j.apacoust.2018.07.027

Abstract

A crucial stage in unsupervised speech enhancement algorithms is the estimation of noise-related parameters, which usually requires prior models of the noise. However, estimating such parameters is challenging at low signal-to-noise ratios or in nonstationary noisy environments. In this paper, an unsupervised, iterative speech enhancement algorithm is proposed that requires no prior models and treats the speech spectrogram and its temporal gradient as sparse components. The quasi-harmonic description of speech signals justifies this assumption. Speech enhancement is performed by decomposing the spectrogram of noisy speech into a sparse matrix while enforcing sparsity and temporal gradient regularizations. The Kullback–Leibler divergence is used to minimize the distance between the observed and reconstructed components under nonnegativity constraints. The alternating direction method of multipliers is used to solve the resulting optimization problem. Unlike many speech enhancement approaches, the proposed algorithm reduces background noise in a straightforward manner, without needing a noise estimation algorithm to find noise-only excerpts. In addition, the proposed algorithm achieves improved performance in adverse environments without knowing the exact distribution of the noise. The experimental results demonstrate that the proposed algorithm outperforms the competing algorithms in terms of speech quality and intelligibility. Moreover, the composite objective measure confirms the better performance in terms of residual noise and speech distortion in strong noise.
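
To make the structure of the cost described above concrete, the following is a minimal sketch of one plausible objective: a generalized Kullback–Leibler data term plus an L1 sparsity penalty and an L1 penalty on the temporal gradient of the estimate. The abstract does not state the exact formulation, so the function names, the weights `lam_sparse` and `lam_grad`, the first-order frame difference used as the temporal gradient, and the `eps` smoothing are all assumptions made for illustration. The sketch only evaluates the cost; it does not implement the alternating direction method of multipliers solver.

```python
import math


def generalized_kl(Y, S, eps=1e-12):
    """Generalized KL divergence D(Y || S), summed over all time-frequency bins.

    Y and S are nonnegative spectrograms given as lists of rows
    (rows = frequency bins, columns = time frames). eps is an assumed
    smoothing constant that avoids log(0) and division by zero.
    """
    total = 0.0
    for y_row, s_row in zip(Y, S):
        for y, s in zip(y_row, s_row):
            total += y * math.log((y + eps) / (s + eps)) - y + s
    return total


def l1_norm(S):
    """Sparsity penalty: sum of absolute values of all entries."""
    return sum(abs(v) for row in S for v in row)


def temporal_gradient_l1(S):
    """L1 norm of the first-order difference along the time axis (assumed form)."""
    return sum(abs(row[t + 1] - row[t]) for row in S for t in range(len(row) - 1))


def objective(Y, S, lam_sparse=0.1, lam_grad=0.1):
    """Assumed cost: KL data term + sparsity term + temporal gradient term.

    lam_sparse and lam_grad are hypothetical regularization weights.
    """
    return (
        generalized_kl(Y, S)
        + lam_sparse * l1_norm(S)
        + lam_grad * temporal_gradient_l1(S)
    )


if __name__ == "__main__":
    # Toy 2 x 2 "spectrogram": 2 frequency bins, 2 time frames.
    Y = [[1.0, 1.0], [0.0, 2.0]]
    print(objective(Y, Y))
```

When the estimate equals the observation, the data term vanishes and only the two regularizers contribute, which gives a quick sanity check on any implementation of such a cost.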

Full Text