Colored Noise Based Multicondition Training Technique for Robust Speaker Identification

L. Zao, R. Coelho

doi:10.1109/lsp.2011.2169453

Abstract

This letter proposes a colored noise based multicondition training technique for robust speaker identification in unknown noisy environments. Colored noise samples are generated by filtering a white Gaussian sequence so that the resulting power spectral density (PSD) is proportional to 1/f^β, where β ∈ [0, 2]. Gaussian mixture models (GMMs) are used to obtain the speaker models from noisy speech signals at a single signal-to-noise ratio (SNR). The technique is evaluated on the speaker identification task with test utterances corrupted by real acoustic noises at different SNR values. The results show that the proposed technique outperforms both white noise based multicondition training and clean-speech training.
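
The noise generation step can be illustrated with a minimal sketch. The abstract does not specify the filter design, so the code below assumes a simple frequency-domain shaping of a white Gaussian sequence (scaling each FFT bin by f^(-β/2), which gives a PSD proportional to 1/f^β); this is an illustrative substitute, not necessarily the filter used in the letter. The function names `colored_noise` and `add_noise_at_snr`, the list of β values, the 10 dB SNR, and the sine wave standing in for speech are all hypothetical choices made for the example.

```python
import numpy as np

def colored_noise(n, beta, rng):
    """Return n samples of unit-variance noise with PSD roughly 1/f^beta.

    Assumption: spectral shaping via the FFT, not the letter's own filter.
    """
    white = rng.standard_normal(n)          # white Gaussian sequence
    spectrum = np.fft.rfft(white)
    freqs = np.fft.rfftfreq(n)              # normalized frequencies in [0, 0.5]
    scale = np.zeros_like(freqs)            # DC bin stays at zero (removes the mean)
    scale[1:] = freqs[1:] ** (-beta / 2.0)  # amplitude f^(-beta/2) gives power 1/f^beta
    shaped = np.fft.irfft(spectrum * scale, n=n)
    return shaped / np.std(shaped)          # normalize to unit variance

def add_noise_at_snr(speech, noise, snr_db):
    """Mix noise into speech so the result has the requested SNR in dB."""
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    gain = np.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
    return speech + gain * noise

# Hypothetical usage: build multicondition training data at a single SNR.
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
speech = np.sin(2.0 * np.pi * 220.0 * t)    # stand-in for a clean utterance
betas = [0.0, 0.5, 1.0, 1.5, 2.0]           # illustrative values within [0, 2]
training_set = [
    add_noise_at_snr(speech, colored_noise(speech.size, b, rng), snr_db=10.0)
    for b in betas
]
```

Here β = 0 corresponds to white noise, β = 1 to pink noise, and β = 2 to brown noise, so sweeping β over [0, 2] exposes the speaker models to a range of spectral tilts while the SNR stays fixed.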
