Abstract
Neural networks were trained with backpropagation to localize "virtual" sounds that could originate from any of 24 azimuths (−165° to +180°) and 6 elevations (−36° to +54°). The sounds (clicks) were filtered with head-related transfer functions and transformed into 22-point quarter-octave spectra. The networks comprised 22 to 44 input nodes, followed by 50 hidden nodes and 30 output nodes (24 azimuth nodes and 6 elevation nodes). In the "binaural" configuration, the interaural delay and the interaural difference spectrum were provided as inputs to the network. In the "monaural" configuration, separate networks were trained with the left-ear and right-ear spectra; a third, "arbitrator" network learned to localize based on the outputs of these two monaural networks (i.e., the activation levels of their azimuth and elevation nodes). The monaural configuration did not allow for binaural interaction in the traditional sense. Both configurations achieved performance comparable to that of humans, suggesting that under these conditions either monaural or binaural cues are sufficient to explain human sound localization. [Work supported by NIH-DC-00786, AFOSR-91-0289, and AFOSR-Task 2313V3.]
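The topology described above can be sketched as a single-hidden-layer feedforward network trained with backpropagation. This is only an illustrative sketch, not the authors' implementation: the abstract states 50 hidden nodes and 30 output nodes (24 azimuth + 6 elevation), but the exact input size (23 here, for a 22-point interaural difference spectrum plus one interaural delay), the sigmoid activation, the squared-error loss, and the learning rate are all assumptions.

```python
import numpy as np

# Assumed layer sizes: the abstract gives only the 22-44 input range,
# 50 hidden nodes, and 30 output nodes (24 azimuth + 6 elevation).
n_in, n_hidden, n_out = 23, 50, 30

rng = np.random.default_rng(0)
W1 = rng.normal(0, 0.1, (n_hidden, n_in))   # input -> hidden weights
W2 = rng.normal(0, 0.1, (n_out, n_hidden))  # hidden -> output weights

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(x):
    """Forward pass; returns hidden and output activations."""
    h = sigmoid(W1 @ x)
    y = sigmoid(W2 @ h)
    return h, y

def backprop_step(x, target, lr=0.5):
    """One gradient-descent step on squared error (assumed loss)."""
    global W1, W2
    h, y = forward(x)
    err = y - target
    delta_out = err * y * (1 - y)                 # output-layer delta
    delta_hid = (W2.T @ delta_out) * h * (1 - h)  # hidden-layer delta
    W2 -= lr * np.outer(delta_out, h)
    W1 -= lr * np.outer(delta_hid, x)
    return 0.5 * np.sum(err ** 2)

# Toy training pattern: one "difference spectrum + delay" input mapped
# to one azimuth node and one elevation node being active.
x = rng.normal(size=n_in)
target = np.zeros(n_out)
target[5] = 1.0       # one of the 24 azimuth nodes
target[24 + 2] = 1.0  # one of the 6 elevation nodes

losses = [backprop_step(x, target) for _ in range(200)]
```

A localization estimate would then be read out as the most active azimuth node and the most active elevation node, which is one plausible decoding of the 24 + 6 output layout the abstract describes.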