Abstract

A computational auditory model for estimating the directions of concurrent speakers in the horizontal plane is presented. The auditory signal processing is motivated by mammalian physiology and human psychoacoustic data, extending the binaural model by Dietz et al. [Brain Res., 1220, 234–245 (2008)]. The model exploits a wide range of interaural parameters, namely, the interaural level difference, the interaural phase difference of the fine structure, and the interaural coherence. The parameters are determined at the output of auditory filters, which allows several sound-source directions to be estimated simultaneously in a multidimensional internal representation. By focusing on time segments in the auditory bands with high interaural coherence [Faller and Merimaa, J. Acoust. Soc. Am. 116, 3075–3089 (2004)], which are likely to stem from a single localized sound source, the robustness of the model's direction estimation was considerably improved: for two well-separated speakers in quiet, the estimation error is generally less than or equal to 10 deg. In noisy environments, the direction of a single speaker can be estimated for signal-to-noise ratios below 0 dB. Further improvements of the model by concurrent pitch estimation, aimed at reducing permutation errors between nearby speakers, are investigated and discussed.
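
As a rough, self-contained illustration of the coherence-selection idea summarized above (not the authors' implementation), the Python sketch below computes running IPD, ILD, and interaural coherence in a single frequency band and estimates an azimuth only from highly coherent segments. The Butterworth band-pass (standing in for an auditory filter), the smoothing time constant, the coherence threshold C0, the effective ear distance, and the sine-law ITD-to-azimuth mapping are all simplifying assumptions.

```python
import numpy as np
from scipy.signal import butter, hilbert, lfilter, sosfiltfilt

FS = 44100           # sampling rate in Hz (assumed)
EAR_DISTANCE = 0.18  # effective interaural distance in m (assumed)
C_SOUND = 343.0      # speed of sound in m/s
C0 = 0.95            # interaural-coherence selection threshold (assumed)

def bandpass(x, f_lo, f_hi, fs=FS):
    """Fourth-order Butterworth band-pass as a stand-in for one auditory filter."""
    sos = butter(4, [f_lo, f_hi], btype="band", fs=fs, output="sos")
    return sosfiltfilt(sos, x)

def smooth(x, tau, fs=FS):
    """One-pole exponential smoothing with time constant tau (handles complex input)."""
    alpha = 1.0 - np.exp(-1.0 / (tau * fs))
    return lfilter([alpha], [1.0, alpha - 1.0], x)

def interaural_params(left, right, tau=0.01, fs=FS):
    """Running IPD, ILD (dB), and interaural coherence from analytic signals."""
    al, ar = hilbert(left), hilbert(right)
    cross = smooth(al * np.conj(ar), tau, fs)  # smoothed cross-power
    p_l = smooth(np.abs(al) ** 2, tau, fs)     # smoothed left power
    p_r = smooth(np.abs(ar) ** 2, tau, fs)     # smoothed right power
    eps = 1e-12
    ipd = np.angle(cross)                      # fine-structure phase difference
    ild = 10.0 * np.log10((p_l + eps) / (p_r + eps))
    coherence = np.abs(cross) / np.sqrt(p_l * p_r + eps)
    return ipd, ild, coherence

def estimate_azimuth(left, right, f_center=500.0, bw=100.0):
    """Median azimuth (deg) over coherence-selected samples in one band."""
    l = bandpass(left, f_center - bw / 2.0, f_center + bw / 2.0)
    r = bandpass(right, f_center - bw / 2.0, f_center + bw / 2.0)
    ipd, _, coh = interaural_params(l, r)
    sel = coh > C0                             # keep only highly coherent segments
    if not np.any(sel):
        return float("nan")
    itd = ipd[sel] / (2.0 * np.pi * f_center)  # phase delay -> time delay
    s = np.clip(itd * C_SOUND / EAR_DISTANCE, -1.0, 1.0)  # sine-law mapping
    return float(np.degrees(np.median(np.arcsin(s))))

if __name__ == "__main__":
    # Demo (hypothetical setup): broadband noise delayed between the ears to
    # mimic a source at ~30 deg; the estimate should land near 30.
    rng = np.random.default_rng(0)
    x = rng.standard_normal(FS)
    delay = int(round(EAR_DISTANCE / C_SOUND * np.sin(np.radians(30.0)) * FS))
    print(estimate_azimuth(left=x, right=np.roll(x, delay)))
```

A multi-speaker variant in the spirit of the abstract would pool such coherence-selected IPD/ILD samples across many auditory bands and search for several peaks in the resulting multidimensional distribution, rather than taking a single-band median as this sketch does.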
