Abstract

Signal subspace methods have been widely used to localize multiple speech sources. However, most of them cannot count the number of sources and do not exploit the sparsity of speech in the frequency domain. This paper presents a grid-search window-dominant signal subspace (GS-WDSS) method and a closed-form WDSS (CF-WDSS) method to localize short-term speech sources. Both methods rest on the generalized sparsity assumption that each window, consisting of several time-adjacent bins, is dominated by one source, as opposed to the conventional assumption that each individual bin is dominated by one source. Under the generalized assumption, the principal eigenvector of the spatial correlation matrix computed on each window spans the signal subspace of the window-dominant source. The direction of arrival (DOA) of the dominant source is estimated from this principal eigenvector. The DOAs and the number of sources are then obtained from the DOA histogram of all dominant sources. The conventional assumption is a special case of the generalized assumption. Using the generalized assumption significantly improves the DOA estimates of the window-dominant sources, at the cost of an acceptable masking effect. The superiority of the proposed methods is verified by simulations and real-world experiments.
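
The window-dominant pipeline described above (per-window spatial correlation matrix, principal eigenvector, DOA estimate, DOA histogram) can be illustrated with a minimal sketch. This is not the authors' implementation: the uniform linear array, the single frequency of 2 kHz, the 4 cm microphone spacing, the 1-degree search grid, the 10-degree histogram bins, and the 20% peak threshold are all assumptions chosen for illustration, and the function names `steering_vector`, `window_doa`, and `summarize_histogram` are hypothetical. Only the grid-search variant is sketched; the closed-form variant is not, since the abstract does not describe it.

```python
import numpy as np

C_SOUND = 343.0  # speed of sound in m/s (assumed)


def steering_vector(theta_deg, n_mics, spacing, freq):
    """Far-field steering vector of a uniform linear array (assumed geometry)."""
    theta = np.deg2rad(theta_deg)
    delays = np.arange(n_mics) * spacing * np.cos(theta) / C_SOUND
    return np.exp(-2j * np.pi * freq * delays)


def window_doa(X, grid_deg, spacing, freq):
    """Grid-search DOA for one window.

    X has shape (n_mics, n_bins): the time-adjacent bins of one window,
    assumed to be dominated by a single source.
    """
    n_mics, n_bins = X.shape
    R = X @ X.conj().T / n_bins          # spatial correlation matrix of the window
    _, eigvecs = np.linalg.eigh(R)       # eigenvalues are returned in ascending order
    u = eigvecs[:, -1]                   # principal eigenvector
    A = np.stack(
        [steering_vector(t, n_mics, spacing, freq) for t in grid_deg], axis=1
    )
    score = np.abs(A.conj().T @ u) ** 2  # match between each candidate DOA and u
    return float(grid_deg[np.argmax(score)])


def summarize_histogram(doas, bin_width=10.0, min_fraction=0.2):
    """Return the centers of histogram bins holding enough window estimates."""
    edges = np.arange(-bin_width / 2, 180 + bin_width, bin_width)
    counts, _ = np.histogram(doas, bins=edges)
    centers = (edges[:-1] + edges[1:]) / 2
    keep = counts >= min_fraction * len(doas)
    return centers[keep].tolist()


# Synthetic demo: windows are alternately dominated by sources at 40 and 110 degrees.
rng = np.random.default_rng(0)
n_mics, spacing, freq, n_bins = 4, 0.04, 2000.0, 8
grid = np.arange(0, 181)
true_doas = [40.0, 110.0]

estimates = []
for w in range(40):
    theta = true_doas[w % 2]
    s = rng.standard_normal(n_bins) + 1j * rng.standard_normal(n_bins)
    noise = 0.01 * (
        rng.standard_normal((n_mics, n_bins))
        + 1j * rng.standard_normal((n_mics, n_bins))
    )
    X = np.outer(steering_vector(theta, n_mics, spacing, freq), s) + noise
    estimates.append(window_doa(X, grid, spacing, freq))

peaks = summarize_histogram(estimates)  # estimated DOAs (bin centers)
n_sources = len(peaks)                  # estimated number of sources
```

In this sketch the number of sources is never supplied to the estimator; it is read off the histogram as the number of sufficiently populated bins, which mirrors the counting idea in the abstract. Setting `n_bins = 1` reduces each window to a single bin, corresponding to the conventional per-bin dominance assumption.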
