Abstract

Understanding how the human brain processes auditory input remains a challenge. Traditionally, a distinction between lower- and higher-level sound features is made, but their definition depends on a specific theoretical framework and might not match the neural representation of sound. Here, we postulate that constructing a data-driven neural model of auditory perception, with a minimum of theoretical assumptions about the relevant sound features, could provide an alternative approach and possibly a better match to the neural responses. We collected electrocorticography recordings from six patients who watched a long-duration feature film. The raw movie soundtrack was used to train an artificial neural network model to predict the associated neural responses. The model achieved high prediction accuracy and generalized well to a second dataset, where new participants watched a different film. The extracted bottom-up features captured acoustic properties that were specific to the type of sound and were associated with various response latency profiles and distinct cortical distributions. Specifically, several features encoded speech-related acoustic properties with some features exhibiting shorter latency profiles (associated with responses in posterior perisylvian cortex) and others exhibiting longer latency profiles (associated with responses in anterior perisylvian cortex). Our results support and extend the current view on speech perception by demonstrating the presence of temporal hierarchies in the perisylvian cortex and involvement of cortical sites outside of this region during audiovisual speech perception.

Highlights

  • Our understanding of how the human brain processes auditory input remains incomplete

  • A deep artificial neural network (ANN) was trained on the raw soundtrack of the movie to predict the associated ECoG responses in the high-frequency band (HFB, 60–95 Hz) [24]; a minimal sketch of this pipeline follows this list

  • We confirmed that our brain-optimized ANN (BO-NN, Fig 1A) model could be successfully applied to a dataset of different participants watching a different audiovisual film (Movie II)
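
As a concrete illustration of the pipeline named in the highlights, the sketch below shows one way such a setup could look: HFB (60–95 Hz) power is extracted from the ECoG signal, and a small 1D convolutional network is trained to predict it from windows of the raw audio waveform. This is a minimal sketch on synthetic toy data, not the BO-NN architecture of Fig 1A; the sampling rates, window length, bin size, and network layers are all illustrative assumptions.

```python
# Minimal sketch (not the authors' code): (1) extract high-frequency-band (HFB,
# 60-95 Hz) power from ECoG, (2) train a small 1D-CNN to predict it from the raw
# audio waveform. All shapes, rates, and the architecture are illustrative assumptions.
import numpy as np
import torch
import torch.nn as nn
from scipy.signal import butter, filtfilt, hilbert

# --- (1) HFB envelope extraction from (synthetic) ECoG -----------------------
fs_ecog = 500                                   # assumed ECoG sampling rate (Hz)
n_elec, n_sec = 8, 60                           # toy data: 8 electrodes, 60 s
ecog = np.random.randn(n_elec, n_sec * fs_ecog)

b, a = butter(4, [60, 95], btype="bandpass", fs=fs_ecog)
hfb = np.abs(hilbert(filtfilt(b, a, ecog, axis=-1), axis=-1))   # analytic amplitude
# Average the envelope into 100 ms bins per electrode.
bin_len = fs_ecog // 10
hfb = hfb[:, : hfb.shape[1] // bin_len * bin_len].reshape(n_elec, -1, bin_len).mean(-1)

# --- (2) CNN mapping raw audio windows to per-electrode HFB ------------------
fs_audio = 16_000                               # assumed audio sampling rate (Hz)
audio = np.random.randn(n_sec * fs_audio).astype(np.float32)
win = fs_audio // 10                            # 100 ms of audio per HFB bin
n_bins = hfb.shape[1]
X = torch.from_numpy(audio[: n_bins * win].reshape(n_bins, 1, win))
Y = torch.from_numpy(hfb.T.astype(np.float32))  # (n_bins, n_elec)

model = nn.Sequential(                          # hypothetical architecture
    nn.Conv1d(1, 16, kernel_size=64, stride=8), nn.ReLU(),
    nn.Conv1d(16, 32, kernel_size=16, stride=4), nn.ReLU(),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(),
    nn.Linear(32, n_elec),
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for epoch in range(5):                          # a few illustrative epochs
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(X), Y)
    loss.backward()
    opt.step()
    print(f"epoch {epoch}: MSE = {loss.item():.3f}")
```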

Introduction

Our understanding of how the human brain processes auditory input remains incomplete. Our aim is to identify the features that different cortical regions extract from the incoming sound signal, and to understand how these features are transformed into high-level representations specific to sound type (speech, music, noise, etc.). Addressing higher-level features has been attempted in neural encoding models of sound processing [8,9], but higher levels of auditory processing are generally more difficult to model because their characteristics (e.g., in speech or music) remain a topic of theoretical investigation. Higher-level features typically require some form of interpretation and labelling that is based on theoretical constructs and may not match cortical representations. Little is known about the mechanisms underlying the transition from lower- to higher-level auditory processing, leaving these levels of explanation disconnected.
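
Encoding models of the kind cited above typically regress the measured neural response onto hand-crafted acoustic features, such as a spectrogram, at a set of fixed time lags; the difficulty described here is that no comparable hand-crafted feature set exists for higher levels of processing. The sketch below is a generic illustration of such a lagged linear encoding model on toy data, not the specific models of refs. [8,9]; the feature choice, lags, regularization, and synthetic signals are all assumptions.

```python
# Minimal sketch of a conventional linear encoding model (generic illustration,
# not the specific models of refs. [8,9]): log-spectrogram features at a few
# fixed time lags are ridge-regressed onto a single electrode's response.
import numpy as np
from scipy.signal import spectrogram
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

fs = 16_000                                    # assumed audio sampling rate (Hz)
audio = np.random.randn(60 * fs)               # toy 60 s sound signal
resp = np.random.randn(600)                    # toy neural response, 100 ms bins

# Hand-crafted features: log-power spectrogram in non-overlapping 100 ms frames.
f, t, S = spectrogram(audio, fs=fs, nperseg=fs // 10, noverlap=0)
feats = np.log(S + 1e-10).T[: len(resp)]       # (n_frames, n_freqs)

# Stack a few fixed lags (0-300 ms) so the model can express response latency
# (np.roll wraps at the edges, which is acceptable for this toy illustration).
lags = [0, 1, 2, 3]
X = np.hstack([np.roll(feats, lag, axis=0) for lag in lags])

X_tr, X_te, y_tr, y_te = train_test_split(X, resp[: len(X)],
                                          test_size=0.2, shuffle=False)
model = Ridge(alpha=1.0).fit(X_tr, y_tr)
print("held-out R^2:", model.score(X_te, y_te))
```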
