Abstract

In this paper, we present a novel estimator for the speech presence probability (SPP) at each time-frequency point in the short-time Fourier transform (STFT) domain. Existing SPP estimators do not perform reliably in nonstationary noise environments when applied to speech enhancement. To overcome this limitation, the proposed method combines two steps. First, spectral outliers are eliminated by selectively smoothing the maximum likelihood estimate of the a priori signal-to-noise ratio (SNR) in the cepstral domain. Second, an adaptive tracking method for the a priori SPP is derived by exploiting the strong correlation of speech presence in neighboring frequency bins of consecutive frames. The proposed approach outperforms state-of-the-art approaches, resulting in less noise leakage and low speech distortion in both stationary and nonstationary noise environments.

Index Terms: speech presence probability, speech enhancement, cepstro-temporal smoothing, time-frequency correlation
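
The first step, smoothing the maximum likelihood a priori SNR estimate in the cepstral domain, can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the SNR floor `xi_min`, the smoothing constants, and the rule of smoothing only the higher quefrency bins strongly are all assumptions chosen for illustration. The abstract does not specify the selection rule, and bias compensation and pitch-peak handling are omitted here.

```python
import numpy as np


def ml_a_priori_snr(noisy_power, noise_power, xi_min=1e-3):
    """ML estimate of the a priori SNR per bin, floored at xi_min (assumed floor)."""
    return np.maximum(noisy_power / noise_power - 1.0, xi_min)


def make_alpha(n_fft, n_envelope=4, alpha_low=0.2, alpha_high=0.9):
    """Quefrency-dependent smoothing constants (hypothetical values).

    Low quefrencies carry the spectral envelope and are smoothed lightly;
    higher quefrencies, where spectral outliers show up, are smoothed heavily.
    The vector is symmetric so the smoothed cepstrum stays symmetric.
    """
    q = np.arange(n_fft)
    q = np.minimum(q, n_fft - q)  # symmetric quefrency index
    return np.where(q < n_envelope, alpha_low, alpha_high)


def smooth_frame(xi_ml, prev_cepstrum, alpha):
    """Recursively smooth one frame of the ML a priori SNR in the cepstral domain.

    xi_ml:         (K,) ML a priori SNR on the one-sided spectrum, K = n_fft/2 + 1
    prev_cepstrum: (n_fft,) smoothed cepstrum of the previous frame
    alpha:         (n_fft,) smoothing constants from make_alpha
    Returns the smoothed SNR (K,) and the smoothed cepstrum (n_fft,).
    """
    cepstrum = np.fft.irfft(np.log(xi_ml))  # log-spectrum -> cepstrum
    smoothed = alpha * prev_cepstrum + (1.0 - alpha) * cepstrum
    xi_smoothed = np.exp(np.fft.rfft(smoothed).real)  # back to the SNR domain
    return xi_smoothed, smoothed


# Usage: a single-bin outlier following a flat (0 dB) previous frame.
n_fft = 16
alpha = make_alpha(n_fft)
prev_cepstrum = np.zeros(n_fft)      # cepstrum of a flat SNR of 1
xi_ml = np.ones(n_fft // 2 + 1)
xi_ml[5] = 100.0                     # isolated spectral outlier
xi_smoothed, prev_cepstrum = smooth_frame(xi_ml, prev_cepstrum, alpha)
```

In this toy frame the isolated peak at bin 5 is pulled toward the value of the previous frame, while setting every smoothing constant to zero returns the ML estimate unchanged.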
