Abstract

In this paper, we address the problem of underdetermined blind source separation (BSS) of anechoic speech mixtures. We propose a demixing algorithm that exploits the sparsity of certain time-frequency expansions of speech signals. Our algorithm merges ℓq-basis-pursuit with ideas based on the degenerate unmixing estimation technique (DUET) [Yılmaz and Rickard, Blind Source Separation of Speech Mixtures via Time-Frequency Masking, IEEE Transactions on Signal Processing, vol. 52, no. 7, pp. 1830-1847, July 2004]. There are two main novel components to our approach: (1) our algorithm makes use of all available mixtures in the anechoic scenario where both attenuations and arrival delays between sensors are considered, without imposing any structure on the microphone positions, and (2) we illustrate experimentally that the separation performance is improved when one uses ℓq-basis-pursuit with q < 1 compared to the q = 1 case. Moreover, we provide a probabilistic interpretation of the proposed algorithm that explains why a choice of 0.1 ≤ q ≤ 0.4 is appropriate in the case of speech. Experimental results on both simulated and real data demonstrate significant gains in separation performance when compared to other state-of-the-art BSS algorithms reported in the literature.
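
For orientation, the setting summarized above can be written out as a minimal sketch. The notation is an assumption made here for illustration and is not taken from the abstract: m mixtures x_k, n > m sources s_j, attenuations a_{kj}, delays \delta_{kj}, and time-frequency coefficients marked with a hat. The paper's own formulation may differ in its details.

```latex
% Anechoic mixing model (assumed notation): each sensor records attenuated, delayed sources
x_k(t) = \sum_{j=1}^{n} a_{kj}\, s_j(t - \delta_{kj}), \qquad k = 1, \dots, m, \quad m < n

% In a time-frequency representation, the delays become phase factors
\hat{x}(\tau, \omega) = A(\omega)\, \hat{s}(\tau, \omega), \qquad
[A(\omega)]_{kj} = a_{kj}\, e^{-i \omega \delta_{kj}}

% \ell^q-basis-pursuit at each time-frequency point: pick the sparsest consistent source vector
\hat{s}^{\,\mathrm{est}}(\tau, \omega) = \arg\min_{s \in \mathbb{C}^n} \sum_{j=1}^{n} |s_j|^q
\quad \text{subject to} \quad A(\omega)\, s = \hat{x}(\tau, \omega)
```

With q = 1 the objective is convex; with q < 1 it is non-convex and penalizes non-sparse solutions more heavily, which is the regime the abstract reports as performing better for speech.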
