Source counting in speech mixtures using a variational EM approach for complex Watson mixture models

Lukas Drude, Dang Hai Tran Vu, Aleksej Chinaev, Reinhold Haeb-Umbach

doi:10.1109/icassp.2014.6854924

Abstract

In this contribution, we derive a variational EM (VEM) algorithm for model selection in complex Watson mixture models, which have recently been proposed as a model of the distribution of normalized microphone array signals in the short-time Fourier transform domain. The VEM algorithm is applied to count the number of active sources in a speech mixture by iteratively estimating the mode vectors of the Watson distributions and suppressing the signals from the corresponding directions. A key theoretical contribution is the derivation of the MMSE estimate of a quadratic form involving the mode vector of the Watson distribution. The experimental results demonstrate the effectiveness of the source counting approach at moderately low SNR. It is further shown that the VEM algorithm is more robust with respect to the threshold values used.
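For orientation, the following minimal sketch illustrates the kind of model the abstract refers to: a microphone array observation at one time-frequency bin is normalized to unit length and scored under a single complex Watson density. It assumes the standard form of that density from the directional statistics literature, p(z; w, kappa) = Gamma(D) / (2 pi^D M(1, D, kappa)) * exp(kappa |w^H z|^2), where M is Kummer's confluent hypergeometric function. The function names (`kummer_1_d`, `normalize`, `watson_log_pdf`) and the example values are hypothetical and are not taken from the paper; the sketch does not implement the VEM algorithm itself.

```python
import cmath
import math


def kummer_1_d(d, kappa, terms=200):
    """Truncated power series for Kummer's function M(1, d, kappa).

    The n-th term is kappa**n / (d)_n, with (d)_n the rising factorial.
    Assumption: kappa is moderate, so 200 terms are more than enough.
    """
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= kappa / (d + n)
    return total


def normalize(x):
    """Scale a complex observation vector to unit Euclidean norm."""
    norm = math.sqrt(sum(abs(v) ** 2 for v in x))
    return [v / norm for v in x]


def watson_log_pdf(z, w, kappa):
    """Log-density of a complex Watson distribution on the unit sphere.

    z: unit-norm observation, w: unit-norm mode vector,
    kappa: concentration parameter (kappa = 0 gives the uniform density).
    """
    d = len(z)
    inner = sum(wi.conjugate() * zi for wi, zi in zip(w, z))  # w^H z
    log_norm = (math.lgamma(d) - math.log(2.0) - d * math.log(math.pi)
                - math.log(kummer_1_d(d, kappa)))
    return log_norm + kappa * abs(inner) ** 2


# Hypothetical two-microphone example (values chosen for illustration only).
w = normalize([1.0 + 0.0j, 0.0 + 1.0j])     # mode vector
z_near = normalize([2.0 + 0.1j, 0.1 + 2.0j])  # roughly aligned with w
z_far = normalize([1.0 + 0.0j, 0.0 - 1.0j])   # orthogonal to w
print(watson_log_pdf(z_near, w, 10.0) > watson_log_pdf(z_far, w, 10.0))
```

The density depends on the observation only through |w^H z|^2, so it is invariant to a common phase rotation of z. This is what makes it suitable for normalized short-time Fourier transform vectors, whose absolute phase carries no spatial information.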

Full Text