MMSE and maximum a posteriori estimators for speech enhancement in additive noise assuming a t-location-scale clean speech prior

Neda Faraji, Akram Kohansal

doi:10.1049/iet-spr.2017.0446

Abstract

The authors derive closed-form solutions for the minimum mean square error (MMSE) and maximum a posteriori (MAP) estimators for speech enhancement in additive Gaussian noise, assuming a t-location-scale probability density function (PDF) as the clean speech prior. Fitting a t-location-scale PDF to the real and imaginary parts of the discrete Fourier transform (DFT) coefficients of clean speech signals yields a lower Jensen–Shannon divergence than other heavy-tailed distributions such as the Laplacian and gamma. The authors utilise the two presented estimators, along with the Wiener filter and MMSE estimators based on Laplacian, gamma, and generalised gamma prior PDFs, to enhance noisy signals from the NOIZEUS database. All the estimators are compared in terms of both signal and noise distortions. The results show that the proposed MMSE estimator yields the lowest squared error and signal distortion in estimating the complex-valued DFT coefficients of speech. The quality of the enhanced signals is also assessed in terms of perceptual evaluation of speech quality (PESQ) and segmental and general signal-to-noise ratios (SNRs).
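
The goodness-of-fit comparison mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the data are synthetic Student-t samples standing in for DFT coefficients, and the helper name `js_divergence_of_fit`, the bin count, and the quantile clipping are all assumptions made for illustration. The sketch fits a candidate PDF by maximum likelihood and measures the Jensen–Shannon divergence between the histogram of the data and the fitted model.

```python
import numpy as np
from scipy import stats
from scipy.spatial.distance import jensenshannon


def js_divergence_of_fit(samples, dist, bins=100):
    """Fit `dist` to `samples` by maximum likelihood and return the
    Jensen-Shannon divergence (in nats) between the sample histogram
    and the fitted model, evaluated on the same bins.

    Hypothetical helper: the bin count and the clipping to the
    0.1%-99.9% quantile range are illustrative choices.
    """
    params = dist.fit(samples)
    lo, hi = np.quantile(samples, [0.001, 0.999])
    edges = np.linspace(lo, hi, bins + 1)
    counts, _ = np.histogram(samples, bins=edges)
    empirical = counts / counts.sum()
    # Model probability mass per bin, renormalised over the clipped range.
    model = np.diff(dist.cdf(edges, *params))
    model = model / model.sum()
    # scipy returns the JS *distance*; squaring gives the divergence.
    return jensenshannon(empirical, model) ** 2, params


# Synthetic heavy-tailed data (assumption: NOT real speech DFT coefficients).
rng = np.random.default_rng(0)
coeffs = stats.t.rvs(df=3, loc=0.0, scale=1.0, size=40000, random_state=rng)

# scipy.stats.t with loc and scale is the t-location-scale family.
js_t, t_params = js_divergence_of_fit(coeffs, stats.t)
js_laplace, _ = js_divergence_of_fit(coeffs, stats.laplace)
js_gauss, _ = js_divergence_of_fit(coeffs, stats.norm)

print(f"t-location-scale: {js_t:.5f}")
print(f"Laplacian:        {js_laplace:.5f}")
print(f"Gaussian:         {js_gauss:.5f}")
```

Because the synthetic samples are drawn from a Student-t distribution, the t-location-scale fit is expected to give the smallest divergence here by construction; with real speech data the ranking would have to be measured rather than assumed.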

Full Text