Abstract

Intelligent localization and separation of multiple concurrent sound sources using only two microphones, and without a visual channel, is a challenging problem that has so far met with little success. Most approaches to this subject rely on complex microphone arrays to localize sound in three dimensions. Psychophysical listening tests show that humans can locate up to six sound sources simultaneously, not only in azimuth but also in elevation and, to some extent, in distance. In addition, humans are able to identify these sources and to separate and process the information they provide, including tone, pitch, and semantics. They do this with no sensor array, only two ears. In this study, we tackle the challenging task of robotic localization and separation of more than two dynamic sound sources using only two microphones inserted into the ear canals of the artificial ears of a humanoid head. Exploiting specific properties of the sound signals and combining self-splitting competitive learning with Bayesian fusion, we achieve promising results compared to state-of-the-art techniques in terms of localization accuracy and separation efficiency.
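The abstract does not detail the clustering step, so the following is only a minimal sketch of the idea behind competitive learning for source localization: noisy per-frame azimuth estimates from binaural cues are clustered by prototype units whose centres drift toward the samples they win, so the centres settle on the dominant source directions. The full method in the paper is a self-splitting variant fused with Bayesian inference; the fixed unit count, the sample values, and all function names below are illustrative assumptions, not the authors' implementation.

```python
import random

def competitive_cluster(samples, n_units=3, lr=0.1, epochs=50):
    """Toy winner-take-all competitive learning in one dimension.

    `samples` are scalar azimuth estimates (degrees); each training
    step moves the closest prototype a fraction `lr` toward the
    sample, so prototypes converge to the cluster centres.
    """
    lo, hi = min(samples), max(samples)
    # Spread the initial prototypes evenly over the observed range.
    centers = [lo + (i + 0.5) * (hi - lo) / n_units for i in range(n_units)]
    for _ in range(epochs):
        for x in samples:
            # Winner-take-all: only the nearest prototype is updated.
            w = min(range(n_units), key=lambda i: abs(centers[i] - x))
            centers[w] += lr * (x - centers[w])
    return sorted(centers)

# Hypothetical noisy azimuth estimates from three concurrent sources
# near -40, 0 and +35 degrees.
rng = random.Random(1)
samples = ([rng.gauss(-40, 3) for _ in range(100)]
           + [rng.gauss(0, 3) for _ in range(100)]
           + [rng.gauss(35, 3) for _ in range(100)])
rng.shuffle(samples)
print(competitive_cluster(samples))  # three centres near -40, 0, +35
```

A self-splitting variant would instead start with a single unit and add units only where one prototype fails to account for its samples, which removes the need to fix the number of sources in advance.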
