In this paper, STFT-based speech enhancement algorithms that estimate short-time spectral amplitudes are proposed. These algorithms use maximum likelihood, maximum a posteriori and minimum mean square error (MMSE) estimators, which respectively use Laplace, Gamma and Exponential probability density functions as noise spectral amplitude priors and a Nakagami distribution as the speech spectral amplitude prior. The phase of noisy speech carries significant information that can be retrieved and utilized. However, the undesired artifacts that result from this process pose many challenges. In this paper, the reconstructed phase is treated as uncertain prior knowledge when deriving a joint MMSE estimate of the (C)omplex speech coefficients given (U)ncertain (P)hase information. The proposed phase reconstruction algorithm assists in generating a clean speech phase. The proposed estimator reduces undesired artifacts and gives satisfactory values between the phase of the noisy signal and the prior phase estimate, and hence yields superior performance in instrumental measures, informal listening and speech quality.