Abstract

In competing-speaker environments, human listeners need to focus or switch their auditory attention according to their dynamic intentions. Reliable cortical tracking of the speech envelope is an effective feature for decoding the target speech from neural signals. Moreover, previous studies revealed that root mean square (RMS)-level-based speech segmentation contributes substantially to target speech perception under the modulation of sustained auditory attention. This study further investigated the effect of RMS-level-based speech segmentation on auditory attention decoding (AAD) performance under both sustained and switched attention in competing-speaker auditory scenes. Objective biomarkers derived from cortical activity were also developed to index dynamic auditory attention states. In the current study, subjects were asked to concentrate on, or switch their attention between, two competing speaker streams. The neural responses to higher- and lower-RMS-level speech segments were analyzed via the linear temporal response function (TRF) before and after attention switched from one speaker stream to the other. Furthermore, the AAD performance of a unified TRF decoding model was compared to that of a speech-RMS-level-based segmented decoding model as the auditory attention states changed dynamically. The results showed that the weight of the typical TRF component at approximately the 100-ms time lag was sensitive to switches of auditory attention. Compared to the unified AAD model, the segmented AAD model improved attention decoding performance under both sustained and switched attention over a wide range of signal-to-masker ratios (SMRs). In competing-speaker scenes, the TRF weight and AAD accuracy can therefore serve as effective indicators of changes in auditory attention.
In addition, over a wide range of SMRs (i.e., from 6 to –6 dB in this study), the segmented AAD model showed robust decoding performance even with short decision windows, suggesting that this speech-RMS-level-based model has the potential to decode dynamic attention states in realistic auditory scenarios.
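The RMS-level-based segmentation step can be sketched as follows. This is a minimal illustration, not the authors' exact pipeline: the 20-ms frame length and the mean-frame-RMS threshold are assumptions, since the abstract does not specify the segmentation parameters.

```python
import numpy as np

def rms_segment_mask(signal, fs, frame_ms=20.0):
    """Split a speech signal into higher- and lower-RMS-level frames.

    Returns one boolean per frame: True marks a higher-RMS-level segment.
    Frame length and the global-mean threshold are illustrative
    assumptions, not the study's actual parameters.
    """
    frame_len = int(fs * frame_ms / 1000)
    n_frames = len(signal) // frame_len
    frames = signal[:n_frames * frame_len].reshape(n_frames, frame_len)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))   # per-frame RMS level
    threshold = rms.mean()                        # assumed: mean frame RMS
    return rms >= threshold

# Example: a loud tone burst followed by a near-silent one
fs = 16000
t = np.arange(fs) / fs
sig = np.concatenate([np.sin(2 * np.pi * 440 * t),
                      0.01 * np.sin(2 * np.pi * 440 * t)])
mask = rms_segment_mask(sig, fs)
```

In this toy example the first half of the frames fall above the mean-RMS threshold (higher-RMS-level segments) and the second half below it.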

Highlights

  • In a competing speaker environment, the target speech perception relies on the modulation of selective auditory attention

  • Many studies have indicated that the temporal response function (TRF) obtained from the target speech stream contains biomarkers that can track switches of the auditory attention states (e.g., Akram et al., 2016; Miran et al., 2020)

  • ANOVA results revealed a main effect of root mean square (RMS)-level-based segment [F(1,15) = 16.77, P = 0.01, ηp² = 0.53] and of attention switching [F(1,15) = 22.43, P < 0.001, ηp² = 0.60], with a significant interaction between the two factors [F(1,15) = 14.25, P = 0.002, ηp² = 0.49]. These results suggested that the first positive component of the TRF response was larger for lower-RMS-level speech segments than for higher-RMS-level speech segments, and that the TRF amplitude of the first positive deflection decreased after auditory attention switched from one speaker stream to the other
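A forward TRF of the kind analyzed above maps the speech envelope to the EEG response across a range of time lags. The sketch below estimates such a TRF with closed-form ridge regression; the lag range (0-400 ms) and the regularization value are illustrative choices, not the study's settings, and real analyses typically use a dedicated toolbox.

```python
import numpy as np

def estimate_trf(envelope, eeg, fs, tmin_ms=0.0, tmax_ms=400.0, reg=1e2):
    """Fit a forward TRF (envelope -> EEG) by ridge regression.

    Returns (lags_ms, weights). The lag range and ridge parameter
    are illustrative assumptions.
    """
    lags = np.arange(int(tmin_ms * fs / 1000), int(tmax_ms * fs / 1000) + 1)
    X = np.zeros((len(envelope), len(lags)))      # time-lagged design matrix
    for j, lag in enumerate(lags):
        X[lag:, j] = envelope[:len(envelope) - lag]
    # Closed-form ridge solution: w = (X'X + reg*I)^-1 X'y
    w = np.linalg.solve(X.T @ X + reg * np.eye(len(lags)), X.T @ eeg)
    return lags * 1000.0 / fs, w

# Toy check: simulated single-channel EEG that lags the envelope by 100 ms
rng = np.random.default_rng(0)
fs = 100
env = rng.standard_normal(2000)
eeg = np.zeros_like(env)
eeg[10:] = env[:-10]                              # 10 samples = 100 ms at 100 Hz
lags_ms, w = estimate_trf(env, eeg, fs)
```

With this simulated delay, the estimated TRF weights peak at the 100-ms lag, which is the component whose amplitude the highlight above reports as attention-sensitive.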


Introduction

In a competing-speaker environment, target speech perception relies on the modulation of selective auditory attention. Some studies have suggested that, in dynamic auditory scenes, salient speech features play an important role in target speech perception through bottom-up auditory pathways (Kaya and Elhilali, 2014; Shuai and Elhilali, 2014). It remains unknown whether dynamic changes of the auditory attention states can be reliably decoded from cortical signals when subjects focus their attention on natural sentences in complex auditory scenes. The neural mechanisms underlying the sensitive tracking of the target speech stream in such scenes also remain to be uncovered.
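Attention decoding in a two-speaker scene is commonly posed as a window-by-window comparison: within each decision window, the envelope reconstructed from EEG is correlated with each speaker's envelope, and the speaker with the higher correlation is labeled as attended. The sketch below illustrates this generic scheme, not the paper's segmented model; the 10-s decision window is an assumed example value.

```python
import numpy as np

def decode_attention(recon_env, env_a, env_b, fs, window_s=10.0):
    """Correlation-based AAD over successive decision windows.

    recon_env : envelope reconstructed from EEG (1-D array)
    env_a, env_b : the two speakers' envelopes, same length
    Returns one 'A'/'B' label per decision window. The window length
    is an illustrative assumption.
    """
    win = int(window_s * fs)
    labels = []
    for start in range(0, len(recon_env) - win + 1, win):
        seg = slice(start, start + win)
        r_a = np.corrcoef(recon_env[seg], env_a[seg])[0, 1]
        r_b = np.corrcoef(recon_env[seg], env_b[seg])[0, 1]
        labels.append('A' if r_a > r_b else 'B')
    return labels

# Toy check: attention "switches" from speaker A to speaker B halfway through
rng = np.random.default_rng(1)
fs = 100
env_a = rng.standard_normal(2000)
env_b = rng.standard_normal(2000)
recon = np.concatenate([env_a[:1000], env_b[1000:]]) \
        + 0.1 * rng.standard_normal(2000)
labels = decode_attention(recon, env_a, env_b, fs)
```

In the toy check, the first decision window tracks speaker A and the second tracks speaker B, mimicking an attention switch being detected from window-wise correlations.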
