Abstract

A computational auditory model for the direction estimation of concurrent speakers in the horizontal plane is presented. The auditory signal processing is motivated by mammalian physiology and human psychoacoustic data, extending the binaural model by Dietz et al. [Brain Res. 1220, 234–245 (2008)]. The model exploits a wide range of interaural parameters, namely the interaural level difference, the interaural phase difference of the fine structure, and the interaural coherence. These parameters are determined at the output of auditory filters, allowing for simultaneous estimation of several sound source directions in a multidimensional internal representation. By focusing on time segments with high interaural coherence [Faller and Merimaa, J. Acoust. Soc. Am. 116, 3075–3089 (2004)] in the auditory bands, which likely stem from a localized sound source, the robustness of the model's direction estimation was considerably improved: for two well‐separated speakers in quiet, the estimation error is generally less than or equal to 10 deg. In noisy environments, the direction of a single speaker can be estimated at signal‐to‐noise ratios below 0 dB. Further improvement of the model through concurrent pitch estimation, to reduce permutation errors between nearby speakers, is investigated and discussed.
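The abstract outlines the core signal chain: extract the interaural level difference (ILD), the fine-structure interaural phase difference (IPD), and the interaural coherence (IC) per auditory band, then retain only those time segments whose coherence is high enough to plausibly stem from a single localized source. The Python sketch below illustrates this idea for a single band. It is a minimal illustration under stated assumptions, not the authors' implementation: the analytic-signal smoothing constant `tau` and the coherence threshold are assumed values, and the gammatone filterbank stage preceding this step is omitted.

```python
import numpy as np
from scipy.signal import hilbert, lfilter

def interaural_parameters(left, right, fs, tau=0.005):
    """Per-sample ILD (dB), fine-structure IPD (rad), and interaural
    coherence (0..1) for one auditory band of a binaural signal.
    tau is an assumed exponential smoothing time constant in seconds."""
    # Analytic signals give access to instantaneous phase and envelope.
    zl, zr = hilbert(left), hilbert(right)
    # First-order exponential smoother: y[n] = a*x[n] + (1-a)*y[n-1].
    alpha = 1.0 - np.exp(-1.0 / (tau * fs))
    smooth = lambda x: lfilter([alpha], [1.0, -(1.0 - alpha)], x)
    # Smoothed interaural cross-product and monaural powers.
    cross = smooth(zl * np.conj(zr))
    pl = smooth(np.abs(zl) ** 2)
    pr = smooth(np.abs(zr) ** 2)
    eps = 1e-12  # guards against division by zero in silent segments
    ild = 10.0 * np.log10((pl + eps) / (pr + eps))  # level difference, dB
    ipd = np.angle(cross)                           # fine-structure IPD, rad
    ic = np.abs(cross) / np.sqrt(pl * pr + eps)     # coherence in [0, 1]
    return ild, ipd, ic

# Usage: keep only samples the coherence test deems reliable
# (the 0.95 threshold is an assumption, not the paper's value).
# ild, ipd, ic = interaural_parameters(left, right, fs=44100)
# reliable = ic > 0.95
```

In a full model of this kind, the IPD and ILD values collected from the reliable segments of all bands would be mapped to azimuth and accumulated in a multidimensional histogram, whose peaks indicate the estimated source directions.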
