Thai speech enhancement using Markov random field

T Saimai, C Tantibundhit, N Thatphithakkul, C Onsuwan

doi:10.1109/ecticon.2012.6254290

Abstract

This paper presents a speech enhancement algorithm for Thai speech corrupted by noise, in which a Markov random field (MRF) is used to remove the noise. Specifically, the noisy speech is transformed using a short-time Fourier transform (STFT), yielding time-frequency coefficients. A voice activity detector (VAD) then classifies each time frame as voiced or unvoiced so that the formants of voiced frames can be recovered. Next, a neighborhood is determined for every STFT coefficient. The MRF model, which has two states, speech and noise, is used to label those STFT coefficients. Coefficients in the speech state (the significant coefficients) are retained, while coefficients in the noise state are set to zero. The enhanced speech is obtained by applying the inverse STFT to the significant coefficients. Experimental results from a perception test on speech corrupted with white Gaussian noise at SNR levels of 0 and 5 dB showed that the proposed algorithm could remove noise effectively.
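
The pipeline described in the abstract (STFT, two-state labeling of each coefficient using its neighborhood, zeroing of noise-state coefficients, inverse STFT) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the MRF potentials, the neighborhood shape, or the inference method, so the sketch assumes a 4-connected time-frequency neighborhood, an Ising-style smoothness prior with weight `beta`, a log power-ratio data term with a hypothetical `threshold`, and a few synchronous ICM-style update sweeps. It also assumes that the first `noise_frames` frames contain noise only, and it omits the VAD and formant-recovery step entirely. All function and parameter names are illustrative.

```python
import numpy as np

def stft(x, n_fft=256):
    """Periodic-Hann STFT with 50% overlap; returns (frames, n_fft // 2 + 1)."""
    hop = n_fft // 2
    win = np.hanning(n_fft + 1)[:-1]  # periodic Hann: shifted copies sum to 1
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * win for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(X, n_fft=256):
    """Plain overlap-add; only the first and last half-frame stay tapered."""
    hop = n_fft // 2
    frames = np.fft.irfft(X, n=n_fft, axis=1)
    out = np.zeros(hop * (len(frames) - 1) + n_fft)
    for i, frame in enumerate(frames):
        out[i * hop:i * hop + n_fft] += frame
    return out

def mrf_labels(power, noise_power, threshold=4.0, beta=1.0, n_iter=5):
    """Label each coefficient 1 (speech) or 0 (noise).

    Assumed model: the data term is the log ratio of coefficient power to
    threshold * noise power; the prior adds +beta per speech neighbor and
    -beta per noise neighbor over a 4-connected neighborhood.
    """
    evidence = np.log(power + 1e-12) - np.log(threshold * noise_power + 1e-12)
    labels = (evidence > 0).astype(int)  # initial guess from the data term alone
    for _ in range(n_iter):
        p = np.pad(labels, 1, mode="edge")
        # number of speech-labeled neighbors (previous/next frame, lower/upper bin)
        speech_nb = p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]
        labels = (evidence + beta * (2 * speech_nb - 4) > 0).astype(int)
    return labels

def enhance(noisy, noise_frames=6, n_fft=256):
    """Zero the noise-state coefficients and resynthesize the signal."""
    X = stft(noisy, n_fft)
    # Assumption: the leading frames are noise-only and give a per-bin noise power.
    noise_power = np.mean(np.abs(X[:noise_frames]) ** 2, axis=0)
    labels = mrf_labels(np.abs(X) ** 2, noise_power)
    return istft(X * labels, n_fft), labels
```

In this sketch, a larger `beta` makes the labeling smoother: isolated high-energy noise coefficients are more likely to be flipped to the noise state, at the cost of possibly removing weak, isolated speech coefficients. Calling `enhance(noisy)` on a one-dimensional NumPy array returns the resynthesized signal together with the binary label map.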
