Abstract
Direction of arrival (DOA) estimation is key to many audio applications. Recently, sparse component analysis (SCA)-based methods have attracted much attention. These methods usually detect single-source points (SSPs) and single-source zones (SSZs), where one source dominates the others in the time-frequency domain, and use them to construct a pooled histogram containing multi-source DOA information. However, the SSZ size in existing methods is fixed and empirically predetermined, so it cannot accommodate the varying spectro-temporal properties of speech sources. Furthermore, a higher SSP concentration in an SSZ implies a locally stronger dominant source and hence more reliable DOA information extracted from that zone, yet existing methods do not take this into account. To address these problems, this paper presents a DOA estimation algorithm for multiple speech sources based on flexible SSZs and concentration weighting. First, in each frame, correlation coefficients of time delay vectors across adjacent frequency bins are calculated to identify SSPs, and flexible SSZs are then constructed from a varying number of SSPs located at consecutive frequency bins. Next, the number of SSPs in each flexible SSZ is taken as a proxy for the zone's concentration degree and used as a weighting factor to form the pooled histogram. Finally, a matching pursuit (MP)-based approach is used to obtain the multi-source DOA estimates. Simulation results show that the proposed method significantly outperforms existing approaches in terms of the noise floor of the pooled histogram, angular resolution, and performance under various signal-to-noise ratios and reverberant conditions. Real-world experiments confirm its effectiveness and also demonstrate considerably reduced computational complexity.
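The flexible-SSZ and concentration-weighting steps described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: it assumes the SSP detection has already produced a per-bin boolean mask for one frame, that each bin has a DOA estimate in degrees, and that every SSP in a zone votes into the histogram with a weight equal to the zone's SSP count. The names `flexible_zones`, `weighted_histogram`, `is_ssp`, `doa_deg`, and `n_bins` are hypothetical.

```python
import numpy as np

def flexible_zones(is_ssp):
    """Group consecutive SSP bins into zones, returned as (start, stop) with stop exclusive."""
    zones = []
    start = None
    for k, flag in enumerate(is_ssp):
        if flag and start is None:
            start = k                      # a new run of SSPs begins
        elif not flag and start is not None:
            zones.append((start, k))       # the run ended at the previous bin
            start = None
    if start is not None:                  # run extends to the last bin
        zones.append((start, len(is_ssp)))
    return zones

def weighted_histogram(is_ssp, doa_deg, n_bins=360):
    """Pooled DOA histogram for one frame, weighted by zone size (assumed scheme)."""
    hist = np.zeros(n_bins)
    for start, stop in flexible_zones(is_ssp):
        weight = stop - start              # number of SSPs in this zone
        zone_doa = np.asarray(doa_deg[start:stop], dtype=float)
        idx = np.floor((zone_doa % 360.0) / 360.0 * n_bins).astype(int)
        idx = np.minimum(idx, n_bins - 1)  # guard against rounding at the upper edge
        np.add.at(hist, idx, weight)       # accumulate, handling repeated indices
    return hist

# Toy frame: two zones of 2 and 3 SSPs around 10 and 120 degrees.
mask = [True, True, False, True, True, True, False]
doa = [10.2, 10.7, 0.0, 120.1, 120.5, 120.9, 0.0]
hist = weighted_histogram(mask, doa)
```

In this toy frame the larger zone contributes more mass per SSP than the smaller one, which is the intended effect of concentration weighting; the MP-based peak extraction that follows in the paper is not sketched here.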