Abstract
In this paper, we address the problem of underdetermined blind source separation (BSS) of anechoic speech mixtures. We propose a demixing algorithm that exploits the sparsity of certain time-frequency expansions of speech signals. Our algorithm merges ℓ^q-basis-pursuit with ideas based on the degenerate unmixing estimation technique (DUET) [Yılmaz and Rickard, Blind Source Separation of Speech Mixtures via Time-Frequency Masking, IEEE Transactions on Signal Processing, vol. 52, no. 7, pp. 1830-1847, July 2004]. There are two main novel components to our approach: (1) our algorithm makes use of all available mixtures in the anechoic scenario, where both attenuations and arrival delays between sensors are considered, without imposing any structure on the microphone positions; and (2) we illustrate experimentally that the separation performance is improved when one uses ℓ^q-basis-pursuit with q < 1 compared to the q = 1 case. Moreover, we provide a probabilistic interpretation of the proposed algorithm that explains why a choice of 0.1 ≤ q ≤ 0.4 is appropriate in the case of speech. Experimental results on both simulated and real data demonstrate significant gains in separation performance when compared to other state-of-the-art BSS algorithms reported in the literature.
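To make the q < 1 minimization concrete, the following is a minimal sketch of what such a step could look like at a single time-frequency point. It is not taken from the paper. It assumes a known m-by-n mixing matrix `A` with m < n and a mixture vector `b`, and it relies on the standard fact that for 0 < q ≤ 1 a minimizer of the sum of |x_i|^q subject to A x = b can be found among the solutions supported on at most m columns of `A`. The function name `lq_basis_pursuit`, its parameters, and the brute-force enumeration are illustrative assumptions, and the estimation of `A` from the mixtures is not shown.

```python
from itertools import combinations

import numpy as np


def lq_basis_pursuit(A, b, q=0.3):
    """Minimise sum_i |x_i|**q subject to A @ x = b (hypothetical helper).

    Brute-force sketch: enumerate every set of m columns of the m-by-n
    matrix A, solve the resulting square system, and keep the candidate
    with the smallest l^q cost.
    """
    m, n = A.shape
    best_x, best_cost = None, np.inf
    for support in combinations(range(n), m):
        cols = list(support)
        sub = A[:, cols]
        # Skip column subsets that do not form an invertible system.
        if np.linalg.matrix_rank(sub) < m:
            continue
        coeffs = np.linalg.solve(sub, b)
        cost = np.sum(np.abs(coeffs) ** q)
        if cost < best_cost:
            x = np.zeros(n, dtype=coeffs.dtype)
            x[cols] = coeffs
            best_x, best_cost = x, cost
    return best_x


# Toy example: 2 mixtures, 3 sources, only the third source is active.
A = np.array([[1.0, 0.0, 1.0 / np.sqrt(2.0)],
              [0.0, 1.0, 1.0 / np.sqrt(2.0)]])
b = A @ np.array([0.0, 0.0, 2.0])
x_hat = lq_basis_pursuit(A, b, q=0.3)
```

The enumeration visits n-choose-m subsets, so it is only practical for the small numbers of sources and mixtures typical of a toy example. The same code accepts complex-valued `A` and `b`, which is how time-frequency coefficients would normally appear.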