
Selective Auditory Attention Detection Using Combined Transformer and Convolutional Graph Neural Networks

Abstract

Attention is one of many human cognitive functions that are essential in everyday life. Given our limited processing capacity, attention helps us focus only on what matters. Focusing attention on one speaker in an environment with many speakers is a critical ability of the human auditory system. This paper proposes a new end-to-end method based on a combined transformer and graph convolutional neural network (TraGCNN) that can effectively detect auditory attention from electroencephalograms (EEGs). This approach eliminates the need for manual feature extraction, which is often time-consuming and subjective. First, the EEG signals are converted to graphs. We then extract attention information from these graphs using spatial and temporal approaches. Finally, our models are trained with these data. Our model can detect auditory attention in both the spatial and temporal domains. Specifically, the EEG input is first processed by transformer layers to obtain a sequential representation of the EEG based on attention onsets. Then, a family of graph convolutional layers is used to find the most active electrodes using the spatial positions of the electrodes. Finally, the corresponding EEG features of the active electrodes are fed into graph attention layers to detect auditory attention. The Fuglsang 2020 dataset is used in the experiments to train and test the proposed and baseline systems. Compared with state-of-the-art attention classification methods from the literature, the new TraGCNN approach yields the highest classification accuracy (80.12%). Additionally, the proposed model outperforms our previous graph-based model for different lengths of EEG segments. The new TraGCNN approach is advantageous because attention detection is achieved from the EEG signals of subjects without requiring the speech stimuli that conventional auditory attention detection methods need. Furthermore, examining the proposed model for different lengths of EEG segments shows that the model is faster than our previous graph-based detection method in terms of computational complexity. The findings of this study have important implications for the understanding and assessment of auditory attention, which is crucial for many applications, such as brain–computer interface (BCI) systems, speech separation, and neuro-steered hearing aid development.
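
The graph-construction and graph-convolution steps described in the abstract can be illustrated with a minimal sketch. This is not the TraGCNN implementation: the electrode coordinates, the distance threshold `radius`, the feature values, and the weight matrix are all hypothetical, and the layer shown is a generic graph convolution of the form ReLU(Â X W) with a symmetrically normalized adjacency matrix.

```python
import math

def build_adjacency(positions, radius):
    """Connect electrodes whose distance is at most `radius` (self-loops included)."""
    n = len(positions)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j or math.dist(positions[i], positions[j]) <= radius:
                adj[i][j] = 1.0
    return adj

def normalize(adj):
    """Symmetric normalization D^-1/2 A D^-1/2, where D holds the row sums of A."""
    deg = [sum(row) for row in adj]
    n = len(adj)
    return [[adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_layer(adj_norm, features, weights):
    """One graph-convolution step: ReLU(adj_norm @ features @ weights)."""
    out = matmul(matmul(adj_norm, features), weights)
    return [[max(0.0, v) for v in row] for row in out]

# Toy data (hypothetical): 4 electrodes on a 2-D layout, 2 features per electrode.
positions = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (3.0, 3.0)]
features = [[1.0, 0.5], [0.8, 0.1], [0.2, 0.9], [0.4, 0.4]]
weights = [[1.0, -1.0], [0.5, 1.0]]  # maps 2 input features to 2 output features

adj = build_adjacency(positions, radius=1.5)
hidden = gcn_layer(normalize(adj), features, weights)
```

In this toy layout the first three electrodes are mutually connected and the fourth is isolated, so its output depends only on its own features. A trained model would learn `weights` from data rather than fix them by hand.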

Similar Papers
  • Research Article
  • Citations: 4
  • 10.7717/peerj-cs.2394
Auditory-GAN: deep learning framework for improved auditory spatial attention detection.
  • Oct 30, 2024
  • PeerJ Computer Science
  • Tasleem Kausar + 6 more

Recent advances in auditory attention detection from multichannel electroencephalography (EEG) signals encounter two challenges: the scarcity of available online EEG data and the detection of auditory attention with low latency. To this end, we propose a complete deep auditory generative adversarial network auxiliary, named auditory-GAN, designed to handle these challenges while generating EEG data and executing auditory spatial detection. The proposed auditory-GAN system consists of a spectro-spatial feature extraction (SSF) module and an auditory generative adversarial network auxiliary (AD-GAN) classifier. The SSF module extracts the spatial feature maps by learning the topographic specificity of alpha power from EEG signals. The designed AD-GAN network addresses the need for extensive training data by synthesizing augmented versions of original EEG data. We validated the proposed method on the widely used KUL dataset. The model assesses the quality of generated EEG images and the accuracy of auditory spatial attention detection. Results show that the proposed auditory-GAN can produce convincing EEG data and achieves a spatial attention detection accuracy of 98.5% for a 10-s decision window of 64-channel EEG data. Comparative analysis reveals that the proposed neural approach outperforms existing state-of-the-art models across EEG data ranging from 64 to 32 channels. The Auditory-GAN model is available at https://github.com/tasleem-hello/Auditory-GAN-/tree/Auditory-GAN.

  • Research Article
  • Citations: 28
  • 10.1016/j.neunet.2024.106580
DGSD: Dynamical graph self-distillation for EEG-based auditory spatial attention detection
  • Jul 26, 2024
  • Neural Networks
  • Cunhang Fan + 7 more

  • Research Article
  • Citations: 46
  • 10.1109/tbme.2023.3294242
Brain Topology Modeling With EEG-Graphs for Auditory Spatial Attention Detection.
  • Jan 1, 2024
  • IEEE Transactions on Biomedical Engineering
  • Siqi Cai + 2 more

Despite recent advances, the decoding of auditory attention from brain signals remains a challenge. A key solution is the extraction of discriminative features from high-dimensional data, such as multi-channel electroencephalography (EEG). However, to our knowledge, topological relationships between individual channels have not yet been considered in any study. In this work, we introduced a novel architecture that exploits the topology of the human brain to perform auditory spatial attention detection (ASAD) from EEG signals. We propose EEG-Graph Net, an EEG-graph convolutional network, which employs a neural attention mechanism. This mechanism models the topology of the human brain in terms of the spatial pattern of EEG signals as a graph. In the EEG-Graph, each EEG channel is represented by a node, while the relationship between two EEG channels is represented by an edge between the respective nodes. The convolutional network takes the multi-channel EEG signals as a time series of EEG-graphs and learns the node and edge weights from the contribution of the EEG signals to the ASAD task. The proposed architecture supports the interpretation of the experimental results by data visualization. We conducted experiments on two publicly available databases. The experimental results showed that EEG-Graph Net significantly outperforms the state-of-the-art methods in terms of decoding performance. In addition, the analysis of the learned weight patterns provides insights into the processing of continuous speech in the brain and confirms findings from neuroscientific studies. We showed that modeling brain topology with EEG-graphs yields highly competitive results for auditory spatial attention detection. The proposed EEG-Graph Net is more lightweight and accurate than competing baselines and provides explanations for the results. Also, the architecture can be easily transferred to other brain-computer interface (BCI) tasks.
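
The EEG-graph representation described above (one node per channel, one edge per channel pair, one graph per time window) can be sketched as follows. This is an illustration only, not EEG-Graph Net: the paper learns node and edge weights during training, whereas this sketch substitutes the absolute Pearson correlation between channels as a fixed, hypothetical edge weight. The channel names and sample values are invented.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0

def eeg_graph(window):
    """Turn one window (channel name -> samples) into (nodes, weighted edges)."""
    nodes = sorted(window)
    edges = {}
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            # Assumption: |correlation| stands in for a learned edge weight.
            edges[(a, b)] = abs(pearson(window[a], window[b]))
    return nodes, edges

def graph_series(recording, window_len):
    """Split a recording into non-overlapping windows and build one graph per window."""
    n = min(len(samples) for samples in recording.values())
    return [
        eeg_graph({ch: s[start:start + window_len] for ch, s in recording.items()})
        for start in range(0, n - window_len + 1, window_len)
    ]

# Toy recording (hypothetical): 3 channels, 8 samples each.
recording = {
    "Cz": [0.0, 1.0, 2.0, 3.0, 3.0, 2.0, 1.0, 0.0],
    "Fz": [0.0, 2.0, 4.0, 6.0, 1.0, 1.0, 1.0, 1.0],
    "Pz": [3.0, 2.0, 1.0, 0.0, 0.0, 1.0, 2.0, 3.0],
}
graphs = graph_series(recording, window_len=4)
```

Each element of `graphs` is a `(nodes, edges)` pair, so the recording becomes a time series of graphs that a graph convolutional network could consume.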

  • Research Article
  • Citations: 52
  • 10.1038/s41598-024-58886-y
A GRU–CNN model for auditory attention detection using microstate and recurrence quantification analysis
  • Apr 17, 2024
  • Scientific Reports
  • Mohammadreza Eskandarinasab + 3 more

Attention, as a cognitive ability, plays a crucial role in perception, helping humans to concentrate on specific objects in the environment while discarding others. In this paper, auditory attention detection (AAD) is investigated using different dynamic features extracted from multichannel electroencephalography (EEG) signals when listeners attend to a target speaker in the presence of a competing talker. To this aim, microstate and recurrence quantification analysis are utilized to extract different types of features that reflect changes in the brain state during cognitive tasks. Then, an optimized feature set is determined by selecting significant features based on classification performance. The classifier model is developed by hybrid sequential learning that combines Gated Recurrent Units (GRU) and a Convolutional Neural Network (CNN) into a unified framework for accurate attention detection. The proposed AAD method shows that the selected feature set provides the most discriminative features for the classification process. Also, it yields the best performance as compared with state-of-the-art AAD approaches from the literature in terms of various measures. The current study is the first to validate the use of microstate and recurrence quantification parameters to differentiate auditory attention using reinforcement learning without access to stimuli.

  • Research Article
  • Citations: 12
  • 10.1016/j.apacoust.2020.107826
Supervised binaural source separation using auditory attention detection in realistic scenarios
  • Dec 17, 2020
  • Applied Acoustics
  • Sahar Zakeri + 1 more

  • Research Article
  • Citations: 25
  • 10.3389/fnins.2021.652058
Auditory Attention Detection via Cross-Modal Attention.
  • Jul 21, 2021
  • Frontiers in Neuroscience
  • Siqi Cai + 3 more

Humans show a remarkable perceptual ability to select the speech stream of interest among multiple competing speakers. Previous studies demonstrated that auditory attention detection (AAD) can infer which speaker is attended by analyzing a listener's electroencephalography (EEG) activities. However, previous AAD approaches perform poorly on short signal segments, and more advanced decoding strategies are needed to realize robust real-time AAD. In this study, we propose a novel approach, i.e., cross-modal attention-based AAD (CMAA), to exploit the discriminative features and the correlation between audio and EEG signals. With this mechanism, we hope to dynamically adapt the interactions and fuse cross-modal information by directly attending to audio and EEG features, thereby detecting the auditory attention activities manifested in brain signals. We also validate the CMAA model through data visualization and comprehensive experiments on a publicly available database. Experiments show that the CMAA achieves accuracy values of 82.8, 86.4, and 87.6% for 1-, 2-, and 5-s decision windows under anechoic conditions, respectively; for a 2-s decision window, it achieves an average of 84.1% under real-world reverberant conditions. The proposed CMAA network not only achieves better performance than the conventional linear model, but also outperforms the state-of-the-art non-linear approaches. These results and data visualization suggest that the CMAA model can dynamically adapt the interactions and fuse cross-modal information by directly attending to audio and EEG features in order to improve the AAD performance.

  • Research Article
  • Citations: 15
  • 10.1109/embc46164.2021.9630508
Auditory Attention Detection with EEG Channel Attention.
  • Nov 1, 2021
  • Annual International Conference of the IEEE Engineering in Medicine and Biology Society
  • Enze Su + 4 more

Auditory attention detection (AAD) seeks to detect the attended speech from EEG signals in a multi-talker scenario, i.e. cocktail party. As the EEG channels reflect the activities of different brain areas, a task-oriented channel selection technique improves the performance of brain-computer interface applications. In this study, we propose a soft channel attention mechanism, instead of hard channel selection, that derives an EEG channel mask by optimizing the auditory attention detection task. The neural AAD system consists of a neural channel attention mechanism and a convolutional neural network (CNN) classifier. We evaluate the proposed framework on a publicly available database. We achieve 88.3% and 77.2% for 2-second and 0.1-second decision windows with 64-channel EEG; and 86.1% and 83.9% for 2-second decision windows with 32-channel and 16-channel EEG, respectively. The proposed framework outperforms other competitive models by a large margin across all test cases.
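
The difference between a soft channel mask and hard channel selection can be shown in a short sketch. The paper derives its mask by optimizing the AAD task; here the per-channel `scores` are fixed, hypothetical values, and the softmax normalization is an assumption made for illustration rather than the authors' exact formulation.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def soft_channel_mask(eeg, scores):
    """Soft attention: scale every channel by its softmax weight; no channel is dropped."""
    mask = softmax(scores)
    weighted = [[w * x for x in channel] for w, channel in zip(mask, eeg)]
    return mask, weighted

def hard_channel_selection(eeg, scores, k):
    """Hard selection: keep the k best-scoring channels and zero out the rest."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = set(ranked[:k])
    return [list(ch) if i in keep else [0.0] * len(ch) for i, ch in enumerate(eeg)]

# Toy data (hypothetical): 3 channels, 4 samples each, one relevance score per channel.
eeg = [[1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0], [0.5, 0.5, 0.5, 0.5]]
scores = [2.0, 0.0, 1.0]

mask, weighted = soft_channel_mask(eeg, scores)
selected = hard_channel_selection(eeg, scores, k=1)
```

The soft mask keeps every channel with a graded weight, which leaves the operation differentiable, while the hard selection discards low-scoring channels entirely.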

  • Research Article
  • Citations: 48
  • 10.3389/fcomp.2021.661178
EEG-Based Auditory Attention Detection and Its Possible Future Applications for Passive BCI
  • Apr 30, 2021
  • Frontiers in Computer Science
  • Joan Belo + 2 more

The ability to discriminate and attend to one specific sound source in a complex auditory environment is a fundamental skill for efficient communication. Indeed, it allows us to follow a family conversation or talk with a friend in a bar. This ability is challenged in hearing-impaired individuals, and more precisely in those with a cochlear implant (CI). Indeed, due to the limited spectral resolution of the implant, auditory perception remains quite poor in a noisy environment or in the presence of simultaneous auditory sources. Recent methodological advances now make it possible to detect, on the basis of neural signals, which auditory stream within a set of multiple concurrent streams an individual is attending to. This approach, called EEG-based auditory attention detection (AAD), is based on fundamental research findings demonstrating that, in a multi-speech scenario, cortical tracking of the envelope of the attended speech is enhanced compared to the unattended speech. Following these findings, other studies showed that it is possible to use EEG/MEG (Electroencephalography/Magnetoencephalography) to explore auditory attention during speech listening in a cocktail-party-like scenario. Overall, these findings make it possible to conceive next-generation hearing aids combining customary technology and AAD. Importantly, AAD also has great potential in the context of passive BCI, in the educational context, and in the context of interactive music performances. In this mini review, we first present the different approaches to AAD and the main limitations of the global concept. We then present its potential applications in the world of non-clinical passive BCI.

  • Research Article
  • Citations: 5
  • 10.3390/s21020531
Implementation of an Online Auditory Attention Detection Model with Electroencephalography in a Dichotomous Listening Experiment
  • Jan 13, 2021
  • Sensors (Basel, Switzerland)
  • Seung-Cheol Baek + 2 more

Auditory attention detection (AAD) is the tracking of a sound source to which a listener is attending based on neural signals. Despite expectation for the applicability of AAD in real-life, most AAD research has been conducted on recorded electroencephalograms (EEGs), which is far from online implementation. In the present study, we attempted to propose an online AAD model and to implement it on a streaming EEG. The proposed model was devised by introducing a sliding window into the linear decoder model and was simulated using two datasets obtained from separate experiments to evaluate the feasibility. After simulation, the online model was constructed and evaluated based on the streaming EEG of an individual, acquired during a dichotomous listening experiment. Our model was able to detect the transient direction of a participant’s attention on the order of one second during the experiment and showed up to 70% average detection accuracy. We expect that the proposed online model could be applied to develop adaptive hearing aids or neurofeedback training for auditory attention and speech perception.
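
The idea of adding a sliding window to a decoder's output can be sketched as below. This is a simplified illustration under stated assumptions, not the authors' online model: the envelope "reconstructed" from EEG is supplied directly as toy data instead of being produced by a trained linear decoder, and the function name `sliding_decisions`, the window length, and the hop size are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0

def sliding_decisions(reconstructed, env_a, env_b, window, hop):
    """For each window, pick the source whose envelope correlates best with the reconstruction."""
    decisions = []
    for start in range(0, len(reconstructed) - window + 1, hop):
        seg = slice(start, start + window)
        r_a = pearson(reconstructed[seg], env_a[seg])
        r_b = pearson(reconstructed[seg], env_b[seg])
        decisions.append("A" if r_a >= r_b else "B")
    return decisions

# Toy envelopes (hypothetical). The listener follows source A, then switches to source B.
env_a = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
env_b = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 1.0]
reconstructed = env_a[:4] + env_b[4:]

decisions = sliding_decisions(reconstructed, env_a, env_b, window=4, hop=4)
```

A shorter window reacts faster to attention switches but gives noisier correlation estimates, which is the usual trade-off in online decoding.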

  • Research Article
  • Citations: 20
  • 10.1109/thms.2022.3176212
A Neural-Inspired Architecture for EEG-Based Auditory Attention Detection
  • Aug 1, 2022
  • IEEE Transactions on Human-Machine Systems
  • Siqi Cai + 4 more

Humans have the ability to focus on one of the sound sources in a noisy scene, which is critical for everyday communication. Auditory attention detection (AAD) seeks to detect selective attention from one’s brain signals. For AAD to be useful in brain–computer interface applications, new approaches with low computational cost, high classification performance, and low latency are required to be developed. In this study, we proposed a novel neural-inspired architecture to mimic the neural computation and coding strategy in the brain for electroencephalography-based AAD. We validated our model through data visualization, and conducted experiments on two publicly available databases. For both KUL and DTU databases, it outperforms both linear and convolutional neural network (CNN) models with consistent improvements from 1 s to 5 s decision windows in terms of detection accuracy. Although the accuracy of the proposed neural-inspired model is inferior to the state-of-the-art spatio-spectral feature (SSF)-CNN model, the computational cost of our model is less than 1% of SSF-CNN’s. Moreover, the neural-inspired decoder is more hardware friendly and energy-efficient due to its biological computing scheme. Overall, the proposed neural-inspired architecture realizes a fast, accurate, and low energy expenditure AAD, which is a big step forward towards practical neuro-steered hearing aids.

  • Research Article
  • Citations: 33
  • 10.1016/j.neunet.2022.05.003
A neuroscience-inspired spiking neural network for EEG-based auditory spatial attention detection
  • May 11, 2022
  • Neural Networks
  • Faramarz Faghihi + 2 more

  • Research Article
  • Citations: 41
  • 10.1016/j.ymeth.2022.04.009
Decoding selective auditory attention with EEG using a transformer model.
  • Aug 1, 2022
  • Methods
  • Zihao Xu + 5 more

  • Research Article
  • Citations: 141
  • 10.1109/tbme.2022.3140246
STAnet: A Spatiotemporal Attention Network for Decoding Auditory Spatial Attention From EEG.
  • Jul 1, 2022
  • IEEE Transactions on Biomedical Engineering
  • Enze Su + 4 more

Humans are able to localize the source of a sound. This enables them to direct attention to a particular speaker in a cocktail party. Psycho-acoustic studies show that the sensory cortices of the human brain respond to the location of sound sources differently, and the auditory attention itself is a dynamic and temporally based brain activity. In this work, we seek to build a computational model which uses both spatial and temporal information manifested in EEG signals for auditory spatial attention detection (ASAD). We propose an end-to-end spatiotemporal attention network, denoted as STAnet, to detect auditory spatial attention from EEG. The STAnet is designed to assign differentiated weights dynamically to EEG channels through a spatial attention mechanism, and to temporal patterns in EEG signals through a temporal attention mechanism. We report the ASAD experiments on two publicly available datasets. The STAnet outperforms other competitive models by a large margin under various experimental conditions. Its attention decision for 1-second decision window outperforms that of the state-of-the-art techniques for 10-second decision window. Experimental results also demonstrate that the STAnet achieves competitive performance on EEG signals ranging from 64 to as few as 16 channels. This study provides evidence suggesting that efficient low-density EEG online decoding is within reach. This study also marks an important step towards the practical implementation of ASAD in real life applications.

  • Research Article
  • Citations: 16
  • 10.1109/tnsre.2023.3291239
Cortical Auditory Attention Decoding During Music and Speech Listening.
  • Jan 1, 2023
  • IEEE Transactions on Neural Systems and Rehabilitation Engineering
  • Adéle Simon + 3 more

It has been demonstrated that from cortical recordings, it is possible to detect which speaker a person is attending in a cocktail party scenario. The stimulus reconstruction approach, based on linear regression, has been shown to be useable to reconstruct an approximation of the envelopes of the sounds attended to and not attended to by a listener from the electroencephalogram data (EEG). Comparing the reconstructed envelopes with the envelopes of the stimuli, a higher correlation between the envelopes of the attended sound is observed. Most of the studies focused on speech listening, and only a few studies investigated the performances and the mechanisms of auditory attention decoding during music listening. In the present study, auditory attention detection (AAD) techniques that have been proven successful for speech listening were applied to a situation where the listener is actively listening to music concomitant with a distracting sound. Results show that AAD can be successful for both speech and music listening while showing differences in the reconstruction accuracy. The results of this study also highlighted the importance of the training data used in the construction of the model. This study is a first attempt to decode auditory attention from EEG data in situations where music and speech are present. The results of this study indicate that linear regression can also be used for AAD when listening to music if the model is trained for musical signals.
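
The stimulus reconstruction approach described above can be sketched with a tiny least-squares decoder. This is a minimal illustration, assuming only two EEG channels and no time lags so that the normal equations reduce to a 2×2 system; practical decoders use many channels, several lags, and regularization. All data values and the `ridge` parameter are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0

def fit_decoder(eeg, envelope, ridge=0.0):
    """Least-squares weights mapping 2-channel EEG to the attended envelope."""
    c0, c1 = eeg
    a = sum(x * x for x in c0) + ridge
    b = sum(x * y for x, y in zip(c0, c1))
    d = sum(y * y for y in c1) + ridge
    p = sum(x * e for x, e in zip(c0, envelope))
    q = sum(y * e for y, e in zip(c1, envelope))
    det = a * d - b * b  # assumed non-zero for this toy data
    return ((d * p - b * q) / det, (a * q - b * p) / det)

def reconstruct(eeg, weights):
    """Apply the decoder: weighted sum of the two channels at each sample."""
    return [weights[0] * x + weights[1] * y for x, y in zip(*eeg)]

# Training data (hypothetical): the attended envelope equals 2*ch0 - 1*ch1.
train_eeg = ([1.0, 1.0, 2.0, 1.0, 0.0, 2.0], [1.0, 2.0, 2.0, 1.0, 0.0, 1.0])
train_env = [1.0, 0.0, 2.0, 1.0, 0.0, 3.0]
weights = fit_decoder(train_eeg, train_env)

# Held-out data: compare the reconstruction with both candidate envelopes.
test_eeg = ([2.0, 0.0, 1.0, 1.0], [1.0, 0.0, 2.0, 0.0])
attended_env = [3.0, 0.0, 0.0, 2.0]
unattended_env = [0.0, 2.0, 1.0, 0.0]
recon = reconstruct(test_eeg, weights)
decision = ("attended"
            if pearson(recon, attended_env) > pearson(recon, unattended_env)
            else "unattended")
```

The decision rule mirrors the description in the abstract: the source whose envelope correlates more strongly with the reconstruction is taken to be the attended one.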

  • Research Article
  • Citations: 9
  • 10.1088/1741-2552/ad4f1a
Attention-guided graph structure learning network for EEG-enabled auditory attention detection
  • May 30, 2024
  • Journal of Neural Engineering
  • Xianzhang Zeng + 2 more

Objective: Decoding auditory attention from brain signals is essential for the development of neuro-steered hearing aids. This study aims to overcome the challenges of extracting discriminative feature representations from electroencephalography (EEG) signals for auditory attention detection (AAD) tasks, particularly focusing on the intrinsic relationships between different EEG channels. Approach: We propose a novel attention-guided graph structure learning network, AGSLnet, which leverages potential relationships between EEG channels to improve AAD performance. Specifically, AGSLnet is designed to dynamically capture latent relationships between channels and construct a graph structure of EEG signals. Main result: We evaluated AGSLnet on two publicly available AAD datasets and demonstrated its superiority and robustness over state-of-the-art models. Visualization of the graph structure trained by AGSLnet supports previous neuroscience findings, enhancing our understanding of the underlying neural mechanisms. Significance: This study presents a novel approach for examining brain functional connections, improving AAD performance in low-latency settings, and supporting the development of neuro-steered hearing aids.
