Abstract

In this paper, STFT-based speech enhancement algorithms that rely on estimation of short-time spectral amplitudes are proposed. These algorithms use Maximum Likelihood (ML), Maximum a Posteriori (MAP) and Minimum Mean Square Error (MMSE) estimators, which use Laplace and Gaussian probability density functions (pdfs) as noise spectral amplitude priors and Nakagami and Gamma distributions as speech spectral amplitude priors. The method uses a joint MMSE estimate of the clean speech amplitude and the clean speech phase, given uncertain phase information, for improved single-channel speech enhancement. Most speech enhancement algorithms concentrate only on the frequency-domain amplitude of speech and not on the phase of the noisy speech, since altering the phase may cause undesired artifacts. In this paper, a recent phase reconstruction algorithm is used to estimate the phase of clean speech. The reconstructed phase is treated as uncertain prior knowledge when deriving a joint MMSE estimate of the Complex speech coefficients given Uncertain Phase (CUP) information. The proposed MMSE-optimal CUP estimator reduces undesired artifacts and provides a satisfactory compromise between the phase of the noisy signal and the prior phase estimate. We evaluate all of the above estimators using speech signals uttered by 10 male and 10 female speakers, taken from the TIMIT database. The proposed method outperforms other benchmark algorithms in terms of segmental signal-to-noise ratio (SSNR), short-time objective intelligibility (STOI) and perceptual evaluation of speech quality (PESQ).
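Of the three evaluation measures named above, SSNR is simple enough to illustrate directly. The following is a minimal sketch of one commonly used definition, not the evaluation code of this paper: the function name `segmental_snr`, the frame length of 256 samples, the use of non-overlapping frames, and the per-frame clamping range of -10 dB to 35 dB are all assumptions chosen for illustration.

```python
import math


def segmental_snr(clean, enhanced, frame_len=256, min_db=-10.0, max_db=35.0):
    """Mean per-frame SNR in dB between a clean and an enhanced signal.

    Assumptions (not taken from the paper): non-overlapping frames of
    frame_len samples, and each frame SNR clamped to [min_db, max_db].
    """
    if len(clean) != len(enhanced):
        raise ValueError("signals must have the same length")

    eps = 1e-12  # guards against log(0) and division by zero
    frame_snrs = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        s = clean[start:start + frame_len]
        e = enhanced[start:start + frame_len]
        signal_energy = sum(x * x for x in s)
        error_energy = sum((x - y) ** 2 for x, y in zip(s, e))
        snr_db = 10.0 * math.log10((signal_energy + eps) / (error_energy + eps))
        # Clamp so that silent or perfectly matched frames do not dominate the mean.
        frame_snrs.append(min(max(snr_db, min_db), max_db))

    if not frame_snrs:
        raise ValueError("signal is shorter than one frame")
    return sum(frame_snrs) / len(frame_snrs)


# Toy usage: a sine wave and a copy scaled by 0.9 (error is 10% of the amplitude).
clean = [math.sin(2.0 * math.pi * 5.0 * n / 256.0) for n in range(512)]
scaled = [0.9 * x for x in clean]
ssnr_scaled = segmental_snr(clean, scaled)
```

In this toy case the error energy is about one hundredth of the signal energy in every frame, so the result is close to 20 dB. Averaging clamped per-frame values, rather than computing one global SNR, keeps low-energy frames from being swamped by loud ones.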
