Auditory spatial attention detection (ASAD) aims to decode the spatial locus of a listener's selective auditory attention from electroencephalogram (EEG) signals. However, current models often fall short in EEG feature extraction, leading to overfitting on small datasets or reduced EEG discriminability. Furthermore, they frequently neglect the topological relationships between EEG channels and, consequently, the underlying brain connectivity. Although graph-based EEG modeling has been applied to ASAD, effectively incorporating both local and global connectivity remains a significant challenge. To address these limitations, we propose a new ASAD model. First, time-frequency feature fusion yields a more precise and discriminative EEG representation. Second, EEG segments are treated as graphs, and graph convolution and a global attention mechanism are leveraged to capture local and global brain connections, respectively. A series of experiments is conducted under leave-trials-out cross-validation. On the MAD-EEG and KUL datasets, the proposed model achieves accuracies more than 9% and 3% higher than those of the corresponding state-of-the-art models, respectively, while its accuracy on the SNHL dataset is roughly comparable to the state of the art. EEG time-frequency feature fusion proves indispensable in the proposed model. EEG electrodes over the frontal cortex are the most important for ASAD, followed by those over the temporal lobe. Additionally, the proposed model performs well even on small datasets. This study contributes to a deeper understanding of the neural encoding underlying human hearing and attention, with potential applications in neuro-steered hearing devices.
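The two connectivity modules mentioned above (graph convolution for local channel relationships, global attention for long-range ones) can be sketched in generic form. This is a minimal illustrative sketch, not the paper's actual architecture: the channel count, feature dimension, random adjacency matrix, single-head attention, and concatenation-based fusion are all assumptions made for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy dimensions: C EEG channels (graph nodes), F features per channel
C, F = 8, 16
X = rng.standard_normal((C, F))   # stand-in for fused time-frequency features

# --- Local connectivity: one Kipf-style graph-convolution layer ---
A = (rng.random((C, C)) < 0.3).astype(float)
A = np.maximum(A, A.T)            # symmetric channel adjacency (hypothetical)
A_hat = A + np.eye(C)             # add self-loops
D_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
W_gc = rng.standard_normal((F, F)) * 0.1
H_local = np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ X @ W_gc, 0.0)  # ReLU

# --- Global connectivity: single-head self-attention across channels ---
Wq, Wk, Wv = (rng.standard_normal((F, F)) * 0.1 for _ in range(3))
Q, K, V = X @ Wq, X @ Wk, X @ Wv
scores = Q @ K.T / np.sqrt(F)
attn = np.exp(scores - scores.max(axis=1, keepdims=True))
attn /= attn.sum(axis=1, keepdims=True)   # row-wise softmax over channels
H_global = attn @ V

# Fuse local and global representations (simple concatenation here)
H = np.concatenate([H_local, H_global], axis=1)
print(H.shape)
```

The design choice the abstract highlights is the combination: graph convolution propagates information only along edges of the channel graph, while the attention term lets every channel attend to every other, so the fused representation carries both local and global brain connections.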