A Gestalt inference model for auditory scene segregation.

Debmalya Chakrabarty, Mounya Elhilali

doi:10.1371/journal.pcbi.1006711

Abstract

Our current understanding of how the brain segregates auditory scenes into meaningful objects is in line with a Gestaltism framework. These Gestalt principles suggest a theory of how different attributes of the soundscape are extracted and then bound together into separate groups that reflect different objects or streams present in the scene. These cues are thought to reflect the underlying statistical structure of natural sounds, in much the same way that statistics of natural images are closely linked to the principles that guide figure-ground segregation and object segmentation in vision. In the present study, we leverage inference in stochastic neural networks to learn emergent grouping cues directly from natural soundscapes including speech, music and sounds in nature. The model learns a hierarchy of local and global spectro-temporal attributes reminiscent of simultaneous and sequential Gestalt cues that underlie the organization of auditory scenes. These mappings operate at multiple time scales to analyze an incoming complex scene and are then fused using a Hebbian network that binds together coherent features into perceptually segregated auditory objects. The proposed architecture successfully emulates a wide range of well-established auditory scene segregation phenomena and quantifies the complementary role of segregation and binding cues in driving auditory scene segregation.
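
The binding idea mentioned in the abstract (a Hebbian stage that groups features whose activity is coherent over time) can be illustrated with a toy sketch. This is not the authors' model: the covariance-style Hebbian rule, the correlation threshold, the synthetic envelopes, and every name below (`hebbian_coherence`, `group_channels`, `toy_scene`) are assumptions made purely for illustration.

```python
import math


def hebbian_coherence(activations, eta=1.0):
    """Accumulate covariance-style Hebbian weights between channels.

    activations: list of channels, each a list of samples over time.
    This update rule is an illustrative assumption, not the paper's rule.
    """
    n = len(activations)
    steps = len(activations[0])
    means = [sum(ch) / steps for ch in activations]
    w = [[0.0] * n for _ in range(n)]
    for t in range(steps):
        centered = [activations[i][t] - means[i] for i in range(n)]
        for i in range(n):
            for j in range(n):
                # Channels that fluctuate together strengthen their link.
                w[i][j] += eta * centered[i] * centered[j] / steps
    return w


def group_channels(w, threshold=0.5):
    """Greedily assign a shared label to channels with high normalized weight."""
    n = len(w)
    labels = [-1] * n
    next_label = 0
    for i in range(n):
        if labels[i] != -1:
            continue
        labels[i] = next_label
        for j in range(i + 1, n):
            norm = math.sqrt(w[i][i] * w[j][j])
            if labels[j] == -1 and norm > 0 and w[i][j] / norm > threshold:
                labels[j] = next_label
        next_label += 1
    return labels


def toy_scene(steps=1000):
    """Four hypothetical feature channels driven by two amplitude envelopes."""
    times = [k / steps for k in range(steps)]
    env_a = [0.5 * (1 + math.sin(2 * math.pi * 4 * t)) for t in times]
    env_b = [0.5 * (1 + math.sin(2 * math.pi * 7 * t)) for t in times]
    # Channels 0 and 1 share envelope A; channels 2 and 3 share envelope B.
    return [env_a, [0.8 * v for v in env_a], env_b, [0.6 * v for v in env_b]]


labels = group_channels(hebbian_coherence(toy_scene()))
print(labels)  # [0, 0, 1, 1]
```

In this sketch, channels that share an amplitude envelope (a stand-in for the common fate cue) end up in the same group, while channels modulated at a different rate are assigned to a separate group.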

Highlights

We live in busy environments, and our surrounds continuously flood our sensory system with complex information that needs to be analyzed in order to make sense of the world around us.
A number of Gestalt principles have been posited as indispensable anchors used by the brain to guide the segregation of auditory scenes into perceptually meaningful objects [8, 47, 58].
These comprise a wide variety of cues: for instance, harmonicity, which couples harmonically related frequency channels together; common fate, which favors sound elements that co-vary in amplitude; and common onsets, which group components that share a similar starting time and, to a lesser degree, a common ending time.

Summary

Introduction

We live in busy environments, and our surrounds continuously flood our sensory system with complex information that needs to be analyzed in order to make sense of the world around us. This process, labeled scene analysis, is common across all sensory modalities including vision, audition and olfaction [1]. Our brain relies on innate dispositions that aid this process and help guide the organization of patterns into perceived objects [2]. These dispositions, referred to as Gestalt principles, inform our current understanding of the perceptual organization of scenes [3, 4]. The sensory mixture is decomposed into feature elements, believed to be the building blocks of the scene. These features reflect the physical nature of sources in the scene, the state and structure of the environment itself, as well as perceptual mappings of these attributes as viewed by the sensory system. This segregation stage is modeled using feature analyses which map the sensory signal into its building blocks, ranging from simple components (e.g. frequency channels) to dimensionally complex kernels [6, 7].

Journal: PLOS Computational Biology
Publication Date: Jan 22, 2019
Citations: 18
License type: CC BY 4.0
