Fast multiple moving sound sources localization utilizing sparseness of speech signals

Eiji Sato, Yosuke Tatekura

doi:10.1121/1.4969526

Abstract

We propose a method for localizing multiple moving sound sources with reduced operation time by utilizing the sparseness of speech signals, and we demonstrate the efficacy of the proposed method through experiments in an actual environment. Sound source localization is an essential function for tasks such as robot audition, particularly for sound sources that move, as they often do in actual environments. It requires both high resolution and real-time processing because its results are used by post-processes such as sound source separation. The proposed method is based on MUSIC (MUltiple SIgnal Classification), which is known as a high-resolution sound source localization method. However, MUSIC has high computational complexity. When the signal has large energy over a wide band in the frequency domain, the operation time increases because MUSIC estimates the direction of the sound source at every frequency. We therefore reduce the operation time by exploiting the sparseness of speech signals, whose energy is distributed sparsely in the time-frequency domain. We select the frequencies with large power from the frequency characteristics of the observed signals and perform sound source localization only at the selected frequencies. The proposed method ran about 8.1 times faster than when all frequencies were used.
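The idea of running MUSIC only at high-power frequencies can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the selection rule, the array geometry, or how per-frequency results are combined, so the top-k selection by mean power, the far-field linear-array steering model, the summation of pseudo-spectra, and all function and parameter names (`select_bins`, `steering`, `music_spectrum`, `localize`, `n_keep`) are assumptions made for this example.

```python
import numpy as np

def select_bins(X, n_keep):
    """Pick the n_keep frequency bins with the largest mean power.

    X: complex STFT block, shape (n_mics, n_bins, n_frames).
    Assumption: top-k selection; the abstract only says "large power".
    """
    power = np.mean(np.abs(X) ** 2, axis=(0, 2))
    return np.sort(np.argsort(power)[-n_keep:])

def steering(freq, angles_deg, mic_pos, c=343.0):
    """Far-field steering vectors for a linear array, shape (n_mics, n_angles)."""
    delays = np.outer(mic_pos, np.sin(np.deg2rad(angles_deg))) / c
    return np.exp(-2j * np.pi * freq * delays)

def music_spectrum(Xf, A, n_src):
    """Narrowband MUSIC pseudo-spectrum for one frequency bin.

    Xf: (n_mics, n_frames) observations; A: (n_mics, n_angles) steering vectors.
    """
    R = Xf @ Xf.conj().T / Xf.shape[1]        # spatial covariance estimate
    _, vecs = np.linalg.eigh(R)               # eigenvalues in ascending order
    En = vecs[:, : Xf.shape[0] - n_src]       # noise-subspace eigenvectors
    denom = np.sum(np.abs(En.conj().T @ A) ** 2, axis=0)
    return 1.0 / np.maximum(denom, 1e-12)

def localize(X, freqs, mic_pos, angles_deg, n_src, n_keep):
    """Sum MUSIC pseudo-spectra over the selected bins only (assumed combination rule)."""
    bins = select_bins(X, n_keep)
    P = np.zeros(len(angles_deg))
    for b in bins:
        A = steering(freqs[b], angles_deg, mic_pos)
        P += music_spectrum(X[:, b, :], A, n_src)
    return angles_deg[np.argmax(P)], bins
```

For moving sources, one would presumably call `localize` on successive short blocks of frames, so that both the selected bins and the direction estimate are refreshed over time. The eigendecomposition and angle scan are then performed `n_keep` times per block instead of once per frequency bin, which is where the saving in this sketch comes from. Note that `localize` as written returns only the single strongest direction; handling several sources would need peak picking on `P`.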
