Abstract
Speaker localization is an active topic in speech processing. In this paper, we use a two-step method based on the Time Difference Of Arrival (TDOA) to localize multiple simultaneous speech sources. In this method, speaker directions are estimated by computing the Generalized Cross Correlation (GCC) between microphone signals. We propose a method that combines subband processing with nested microphone arrays. Subband processing increases the accuracy of multiple-speaker localization, and the nested array removes spatial aliasing by selecting microphone subsets and assigning them to different subbands. Once the microphones for each subband are determined, subband processing is applied only to the data from that microphone subset. Moreover, to target high-noise environments, we use GCC-Maximum Likelihood (GCC-ML) as the localization core of the proposed method. Together, these components eliminate spatial aliasing and increase localization accuracy. Simulation results in different environmental scenarios validate the superior performance of the proposed method in localizing multiple simultaneous speakers.
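As a rough illustration of the two-step idea (estimate a TDOA with GCC, then convert it to a direction), the following is a minimal sketch and not the paper's implementation. It assumes a single microphone pair, a far-field source, and a sound speed of 343 m/s. It uses PHAT weighting as a stand-in, because the ML weighting of GCC-ML needs noise statistics that the abstract does not specify. All function names and parameters are hypothetical. The last helper shows the standard half-wavelength spacing limit that motivates pairing closely spaced microphone subsets with higher-frequency subbands.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def gcc_tdoa(x1, x2, fs, max_tau=None, use_phat=True):
    """Estimate the delay of x2 relative to x1 in seconds (hypothetical helper).

    A positive result means x2 lags x1. PHAT weighting is an assumption
    made here; it is not the ML weighting named in the abstract.
    """
    n = len(x1) + len(x2)  # zero-pad so the correlation is linear, not circular
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)
    cross = X2 * np.conj(X1)  # cross-power spectrum
    if use_phat:
        cross = cross / (np.abs(cross) + 1e-12)  # keep phase, drop magnitude
    cc = np.fft.irfft(cross, n=n)
    max_shift = n // 2
    if max_tau is not None:
        max_shift = max(1, min(int(fs * max_tau), max_shift))
    # Reorder so the array covers lags -max_shift .. +max_shift
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(cc)) - max_shift
    return lag / fs


def tdoa_to_doa(tau, mic_distance, c=SPEED_OF_SOUND):
    """Far-field direction of arrival in radians, measured from broadside."""
    return float(np.arcsin(np.clip(c * tau / mic_distance, -1.0, 1.0)))


def max_unaliased_spacing(f_max, c=SPEED_OF_SOUND):
    """Largest microphone spacing (m) free of spatial aliasing up to f_max (Hz)."""
    return c / (2.0 * f_max)  # half the shortest wavelength


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fs = 16000
    x1 = rng.standard_normal(4096)
    delay = 7  # samples
    x2 = np.concatenate((np.zeros(delay), x1[:-delay]))
    tau = gcc_tdoa(x1, x2, fs)
    print("estimated delay (samples):", round(tau * fs))
    print("DOA (degrees):", np.degrees(tdoa_to_doa(tau, mic_distance=0.2)))
    print("max spacing for a 4 kHz subband (m):", max_unaliased_spacing(4000.0))
```

In a subband scheme like the one described, a function such as `gcc_tdoa` would be applied per subband to the microphone pair whose spacing satisfies `max_unaliased_spacing` for that band's upper frequency. How the paper combines the per-subband estimates is not stated in the abstract.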