Abstract

Separating out and correctly grouping the sounds of a communicating animal from the natural acoustic environment poses significant challenges to models of auditory scene analysis, yet animals perform this task very effectively. To date, most models have focussed on simultaneous grouping cues and the segregation of discrete sound events, although some take the longer-term context into account. Inspired by the important part that form plays in the segregation and recognition of visual objects, we consider the role of form in auditory scene analysis. By form in audition we mean the dynamic spectrotemporal patterns characterizing individual sound events as well as their timing with respect to each other. We present a model capable of segregating and recognizing natural communication calls within complex acoustic environments. Incoming sounds are processed using a model of the auditory periphery and fed into a recurrent neural network that rapidly tunes itself to respond preferentially to specific events. Representations of predictable patterns of events in the sequence are created on the fly and maintained on the basis of their predictive success and conflict with other representations. Activation levels of these representations are interpreted in terms of object recognition.
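
The abstract outlines a three-stage pipeline: a peripheral auditory model, a recurrent network that rapidly tunes to recurring events, and pattern representations maintained by predictive success. The paper's actual filterbank, learning rule, and maintenance equations are not given in the abstract, so the following is a minimal illustrative sketch under assumed stand-ins: a crude envelope filterbank for the periphery, a Hebbian-adapting leaky recurrent layer for the fast-tuning stage, and a pool of event-transition representations whose activations grow when their predictions succeed and decay when they conflict with the input. All class names, parameters, and signal settings below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def peripheral_model(signal, sr, n_channels=32, fmin=100.0, fmax=8000.0):
    """Crude cochleagram: envelopes of log-spaced narrowband channels,
    a stand-in for the paper's (unspecified) auditory-periphery model."""
    freqs = np.geomspace(fmin, fmax, n_channels)
    t = np.arange(len(signal)) / sr
    win = np.ones(max(1, int(sr * 0.01)))  # ~10 ms smoothing window
    win /= win.sum()
    rows = []
    for f in freqs:
        # Quadrature demodulation: lowpass the complex baseband product,
        # then take the magnitude to get the narrowband envelope at f.
        demod = np.convolve(signal * np.exp(-2j * np.pi * f * t), win, mode="same")
        rows.append(np.abs(demod))
    return np.stack(rows)  # shape (n_channels, n_samples)

class FastTuningRNN:
    """Leaky recurrent layer whose input weights adapt online (Hebbian rule),
    so units rapidly come to respond preferentially to recurring events.
    Purely illustrative; the paper's learning rule is not in the abstract."""
    def __init__(self, n_in, n_units=16, leak=0.9, eta=0.05):
        self.W_in = rng.normal(scale=0.1, size=(n_units, n_in))
        self.W_rec = rng.normal(scale=0.1, size=(n_units, n_units))
        self.h = np.zeros(n_units)
        self.leak, self.eta = leak, eta

    def step(self, x):
        drive = np.tanh(self.W_in @ x + self.W_rec @ self.h)
        self.h = self.leak * self.h + (1 - self.leak) * drive
        # Fast Hebbian tuning: strengthen input weights of active units,
        # then renormalize rows so no unit's weights grow without bound.
        self.W_in += self.eta * np.outer(self.h, x)
        self.W_in /= np.maximum(np.linalg.norm(self.W_in, axis=1, keepdims=True), 1e-8)
        return self.h

class PatternPool:
    """Representations of predictable event sequences, created on the fly.
    Each stores a previous event and a predicted next event; its activation
    grows with predictive success and decays when predictions fail."""
    def __init__(self, threshold=0.8, gain=0.2, decay=0.1):
        self.patterns = []     # list of (prev_event, predicted_next), unit norm
        self.activation = []   # one activation level per representation
        self.threshold, self.gain, self.decay = threshold, gain, decay

    def observe(self, prev_event, event):
        matched = False
        for i, (p_prev, p_next) in enumerate(self.patterns):
            if np.dot(p_prev, prev_event) > self.threshold:
                hit = np.dot(p_next, event)  # did the prediction succeed?
                self.activation[i] += self.gain * hit - self.decay * (1 - hit)
                matched = matched or hit > self.threshold
        if not matched:  # no existing representation predicted this: create one
            self.patterns.append((prev_event, event))
            self.activation.append(self.gain)
        return self.activation  # read out as graded recognition evidence

# Usage sketch: a hypothetical 100 ms tonal "call" repeating predictably in noise.
sr = 16000
call = np.sin(2 * np.pi * 1000 * np.arange(sr // 10) / sr)
scene = 0.3 * rng.normal(size=sr)
for onset in (2000, 6000, 10000):
    scene[onset:onset + len(call)] += call

cochlea = peripheral_model(scene, sr)
rnn, pool, prev = FastTuningRNN(n_in=cochlea.shape[0]), PatternPool(), None
for frame in cochlea.T[::160]:  # ~10 ms hops
    h = rnn.step(frame)
    h = h / (np.linalg.norm(h) + 1e-8)
    if prev is not None:
        pool.observe(prev, h)
    prev = h
print("pattern activations:", np.round(pool.activation, 2))
```

In this toy reading, the repeated call drives the same RNN units each time, so one transition representation keeps predicting correctly and its activation climbs above those of noise-driven patterns; interpreting such activation levels as recognition is the final step the abstract describes.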