A time-frequency blind separation method for underdetermined speech mixtures

Yao Lv, Shuangtian Li

doi:10.1007/s11767-008-0011-1

Abstract

The proposed Blind Source Separation (BSS) method, based on sparse representations, combines time-frequency analysis with clustering to separate underdetermined speech mixtures in the anechoic case without requiring the number of sources to be known. It thereby remedies a limitation of the Degenerate Unmixing Estimation Technique (DUET), which assumes the number of sources a priori. In the proposed algorithm, the Short-Time Fourier Transform (STFT) provides the sparse representations; a clustering method called Unsupervised Robust C-Prototypes (URCP), which can accurately identify multiple clusters without knowing their number in advance, replaces the histogram-based technique in DUET; and binary time-frequency masks are constructed to separate the mixtures. Experimental results indicate that the proposed method yields a substantial increase in the average Signal-to-Interference Ratio (SIR) while maintaining good speech quality in the separated signals.
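The feature-extraction and masking steps named in the abstract can be illustrated with a minimal sketch. It assumes the standard two-microphone anechoic DUET model, in which the ratio of the two mixture STFTs at a time-frequency point dominated by one source equals that source's relative attenuation times a phase term set by its relative delay. The function names `duet_features` and `binary_masks` are hypothetical, the time-frequency points are passed as flat lists, and the cluster prototypes are assumed to be given. URCP itself is not implemented here; a plain nearest-prototype assignment stands in for the clustering result.

```python
import cmath


def duet_features(x1, x2, omegas, eps=1e-12):
    """Estimate (attenuation, delay) at each time-frequency point.

    x1, x2 : complex STFT coefficients of the two mixtures (flat lists).
    omegas : angular frequency of each point; must be nonzero, so the
             DC bin has to be skipped by the caller.
    Assumes the anechoic model X2/X1 = a * exp(-1j * omega * d) wherever
    a single source dominates.
    """
    feats = []
    for c1, c2, w in zip(x1, x2, omegas):
        ratio = c2 / (c1 + eps)          # eps guards against division by zero
        attenuation = abs(ratio)
        delay = -cmath.phase(ratio) / w  # unambiguous only while |w * d| < pi
        feats.append((attenuation, delay))
    return feats


def binary_masks(feats, prototypes):
    """Build one 0/1 mask per prototype by nearest-prototype assignment.

    prototypes : list of (attenuation, delay) cluster centres, assumed to
                 come from a clustering step such as URCP (not shown).
    """
    masks = [[0] * len(feats) for _ in prototypes]
    for k, (a, d) in enumerate(feats):
        nearest = min(
            range(len(prototypes)),
            key=lambda j: (a - prototypes[j][0]) ** 2 + (d - prototypes[j][1]) ** 2,
        )
        masks[nearest][k] = 1
    return masks
```

Multiplying one mixture's STFT by each mask and inverting the transform would then give one estimated source per prototype. The unweighted squared distance used here is only an illustrative choice, since attenuation and delay live on different scales.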
