Abstract

Sound source localization is a key capability for autonomous robots conducting missions such as search and rescue and target tracking in challenging environments. However, localizing multiple sound sources and tracking static sound sources during self-motion are both difficult tasks, especially as the number of sources or reflections increases. This study presents two robotic hearing approaches based on a human perception model (Wallach, 1939) that combines interaural time difference (ITD) and head-turn motion data to locate sound sources. The first method uses a fitting-based approach to recognize changing trends in the cross-correlation function of the binaural inputs. The method was validated on data collected from a two-microphone array rotating in a non-anechoic environment, and the experiments show that it can separate and localize up to three sound sources with identical spectral content (white noise) at different azimuth and elevation angles. The second method uses an extended Kalman filter (EKF) that estimates the direction of a sound source by fusing the robot's self-motion and ITD data, recursively reducing localization error. This method requires little memory and can continuously track the changing relative positions of several static sources as the robot moves. In the experiments, up to three sources were tracked simultaneously with a two-microphone array.
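As a rough illustration of the core ingredient shared by both methods, the sketch below estimates an ITD from two microphone channels via the peak of their cross-correlation and converts it to an azimuth with the far-field free-field relation sin(theta) = c*tau/d. This is not the paper's fitting procedure; the sample rate, microphone spacing, and signal are illustrative assumptions.

```python
# Minimal ITD-from-cross-correlation sketch (assumptions: far-field source,
# free-field model sin(theta) = c*tau/d, illustrative sample rate and spacing).
import numpy as np

def estimate_itd(left, right, fs):
    """Return the time lag (s) at which the two channels best align."""
    corr = np.correlate(left, right, mode="full")   # cross-correlation over all lags
    lags = np.arange(-len(right) + 1, len(left))    # lag axis in samples
    return lags[np.argmax(corr)] / fs               # peak lag -> seconds

def itd_to_azimuth(itd, mic_distance, c=343.0):
    """Map an ITD to azimuth (rad), clipped to the physically valid range."""
    return np.arcsin(np.clip(c * itd / mic_distance, -1.0, 1.0))

# Hypothetical example: a white-noise source whose wavefront reaches the
# right microphone 5 samples before the left one.
fs, d = 48_000, 0.2                                 # sample rate (Hz), mic spacing (m)
src = np.random.default_rng(0).standard_normal(fs // 10)
left, right = src[:-5], src[5:]                     # right channel leads by 5 samples
theta = itd_to_azimuth(estimate_itd(left, right, fs), d)
print(f"estimated azimuth: {np.degrees(theta):.1f} deg")   # about +10 deg
```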
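For the second method, the following is a minimal scalar-state sketch of an EKF that fuses known head-turn increments with ITD measurements, under stated assumptions rather than the paper's exact filter: the state is one source's relative azimuth, the prediction subtracts the head-turn increment, and the measurement model is h(x) = (d/c)*sin(x). The noise levels and geometry are illustrative.

```python
# Scalar EKF sketch: track a static source's azimuth relative to a turning
# robot by fusing self-motion (prediction) with ITD measurements (update).
import numpy as np

class AzimuthEKF:
    def __init__(self, x0, p0, q, r, mic_distance, c=343.0):
        self.x, self.p = x0, p0          # azimuth estimate (rad) and its variance
        self.q, self.r = q, r            # process / measurement noise variances
        self.d, self.c = mic_distance, c

    def predict(self, dtheta):
        # A static source's relative azimuth shifts opposite to the head turn.
        self.x -= dtheta
        self.p += self.q

    def update(self, itd):
        h = (self.d / self.c) * np.sin(self.x)   # predicted ITD
        H = (self.d / self.c) * np.cos(self.x)   # Jacobian dh/dx
        s = H * self.p * H + self.r              # innovation variance
        k = self.p * H / s                       # Kalman gain
        self.x += k * (itd - h)                  # correct with the ITD residual
        self.p *= 1.0 - k * H                    # variance update

# Hypothetical run: source at 30 deg while the robot turns 1 deg per step.
ekf = AzimuthEKF(x0=0.0, p0=1.0, q=1e-6, r=(1e-5) ** 2, mic_distance=0.2)
true_az, step = np.radians(30.0), np.radians(1.0)
rng = np.random.default_rng(1)
for _ in range(60):
    true_az -= step                              # ground-truth relative azimuth
    ekf.predict(step)
    itd = (0.2 / 343.0) * np.sin(true_az) + rng.normal(0.0, 1e-5)
    ekf.update(itd)
print(f"tracked azimuth: {np.degrees(ekf.x):.1f} deg "
      f"(truth {np.degrees(true_az):.1f} deg)")
```

Tracking several sources would run one such filter per source; the scalar state is what keeps the memory footprint small.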
