Chapter 4 - Unsupervised single-channel speech enhancement based on phase aware time-frequency mask estimation

Nasir Saleem, Muhammad Irfan Khattak

doi:10.1016/b978-01-2-823898-1.00006-0

Abstract

Time-frequency masking combined with spectral phase estimation can help recover the intelligibility and quality of speech degraded by background noise. The unsupervised speech enhancement method proposed here is designed to reduce noise in nonstationary and otherwise difficult noisy environments. It replaces the spectral phase of the noisy speech with an estimated spectral phase and combines it with a novel time-frequency mask during signal reconstruction. To estimate the mask, variance-based features are extracted and passed through an unsupervised, nonparametric adaptive threshold: features that satisfy the threshold condition are retained, whereas those that violate it are discarded. The estimated time-frequency mask is then used to obtain the enhanced speech. For phase estimation, the noisy phase is decomposed into the spectrum of the instantaneous noisy phase, followed by temporal smoothing to reduce its variations. Results show considerable improvements in short-time objective intelligibility (STOI), perceptual evaluation of speech quality (PESQ), segmental signal-to-noise ratio (SSNR), and speech distortion.
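The pipeline outlined in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation: the abstract does not give the exact feature definition, threshold rule, or smoothing scheme, so the local temporal variance of the magnitude, the per-bin median threshold, the phasor-averaging smoother, and all function names and the `half_width` parameter below are assumptions chosen only to make the idea concrete. The input is assumed to be a precomputed short-time Fourier transform stored as `spec[t][f]` (frame `t`, frequency bin `f`).

```python
import cmath
import statistics


def _window(t, n_frames, half_width):
    """Frame indices of a temporal window centred on t, clipped at the edges."""
    return range(max(0, t - half_width), min(n_frames, t + half_width + 1))


def variance_features(mag, half_width=1):
    """Local variance of the magnitude over time for each time-frequency unit."""
    n_frames, n_bins = len(mag), len(mag[0])
    return [
        [
            statistics.pvariance([mag[k][f] for k in _window(t, n_frames, half_width)])
            for f in range(n_bins)
        ]
        for t in range(n_frames)
    ]


def adaptive_binary_mask(feats):
    """Retain units whose feature exceeds a data-driven, per-bin threshold.

    Assumption: the threshold is the median of the feature over time in each
    frequency bin. The chapter's actual nonparametric rule may differ.
    """
    n_frames, n_bins = len(feats), len(feats[0])
    thresholds = [
        statistics.median([feats[t][f] for t in range(n_frames)])
        for f in range(n_bins)
    ]
    return [
        [1.0 if feats[t][f] > thresholds[f] else 0.0 for f in range(n_bins)]
        for t in range(n_frames)
    ]


def smooth_phase(phase, half_width=1):
    """Temporal smoothing of the phase by averaging unit phasors.

    Averaging phasors instead of raw angles avoids wrap-around artefacts.
    """
    n_frames, n_bins = len(phase), len(phase[0])
    return [
        [
            cmath.phase(
                sum(cmath.exp(1j * phase[k][f]) for k in _window(t, n_frames, half_width))
            )
            for f in range(n_bins)
        ]
        for t in range(n_frames)
    ]


def enhance(spec, half_width=1):
    """Apply the estimated mask and the smoothed phase to a noisy STFT."""
    mag = [[abs(x) for x in frame] for frame in spec]
    phase = [[cmath.phase(x) for x in frame] for frame in spec]
    mask = adaptive_binary_mask(variance_features(mag, half_width))
    est_phase = smooth_phase(phase, half_width)
    return [
        [m * a * cmath.exp(1j * p) for m, a, p in zip(mask_t, mag_t, phase_t)]
        for mask_t, mag_t, phase_t in zip(mask, mag, est_phase)
    ]
```

In this sketch a frequency bin whose magnitude stays constant over time has zero variance everywhere and is masked out entirely, while a bin containing a short burst of energy is retained around the burst. An inverse STFT of the returned matrix would yield the enhanced waveform; that step is omitted here.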