Abstract

A central problem in automatic sound recognition is the mapping between low-level audio features and the meaningful content of an auditory scene. We propose a dynamic network model to perform this mapping. In acoustics, much research is devoted to low-level perceptual abilities such as audio feature extraction and grouping, which have been translated into successful signal processing techniques. However, little work has been done on modeling knowledge and context in sound recognition, even though this information is necessary to identify a sound event, rather than merely to separate its components from a scene. We first investigate the role of context in human sound identification in a simple experiment. Then we show that the use of knowledge in a dynamic network model can improve automatic sound identification by reducing the search space of the low-level audio features. Furthermore, context information resolves ambiguities that arise from multiple interpretations of one sound event.
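The abstract does not specify how the dynamic network model combines context with acoustic evidence; as a rough illustration of the general idea that context can resolve an acoustically ambiguous event and restrict the set of candidate interpretations, here is a minimal Bayesian-style sketch in Python. All class names, prior values, and likelihood values are hypothetical and are not taken from the paper.

```python
import numpy as np

# Hypothetical sound classes and acoustic likelihoods P(features | class),
# e.g. scores from a low-level audio feature classifier. On their own,
# these scores are ambiguous: several classes are nearly tied.
CLASSES = ["door slam", "gunshot", "hammer blow", "dog bark"]
acoustic_likelihood = np.array([0.30, 0.28, 0.27, 0.15])

# Hypothetical context priors P(class | context): an office scene makes
# "door slam" far more plausible than "gunshot", and vice versa at a
# firing range. Near-zero priors effectively prune classes from the search.
context_prior = {
    "office":       np.array([0.60, 0.01, 0.09, 0.30]),
    "firing range": np.array([0.05, 0.85, 0.05, 0.05]),
}

def identify(context: str) -> str:
    """Combine acoustic evidence with context: posterior ~ likelihood * prior."""
    posterior = acoustic_likelihood * context_prior[context]
    posterior /= posterior.sum()  # normalize to a probability distribution
    return CLASSES[int(np.argmax(posterior))]

# The same acoustic evidence yields different identifications in
# different contexts: context disambiguates the event.
print(identify("office"))        # -> "door slam"
print(identify("firing range"))  # -> "gunshot"
```

The point of the sketch is only the mechanism: context reweights (and effectively prunes) the candidate interpretations, so the same ambiguous low-level evidence maps to different, contextually plausible sound events.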
