Detecting the locus of auditory attention based on the spectro-spatial-temporal analysis of EEG

Yifan Jiang, Jing Jin, Ning Chen

doi:10.1088/1741-2552/ac975c

Abstract

Objective. Auditory attention decoding (AAD) determines which speaker a listener is attending to by analyzing the listener's EEG. A convolutional neural network (CNN) has been used to extract spectro-spatial features (SSF) from short intervals of EEG and detect auditory spatial attention without access to the stimuli. However, the SSF-CNN scheme does not address the following factors: (a) single-band frequency analysis cannot represent the EEG pattern precisely; (b) power cannot represent the EEG features related to the dynamic patterns of the attended auditory stimulus; (c) the temporal features of EEG, which reflect the relationship between the EEG and the attended stimulus, are not extracted. To solve these problems, the SSF-CNN scheme was modified.

Approach. (a) Multiple frequency bands of EEG, rather than the alpha band alone, were analyzed to represent the EEG pattern more precisely. (b) Differential entropy (DE), rather than power, was extracted from each frequency band to represent the degree of disorder of the EEG, which is related to the dynamic patterns of the attended auditory stimulus. (c) A CNN and a convolutional long short-term memory network (ConvLSTM) were combined to extract spectro-spatial-temporal features from a 3D descriptor sequence constructed from the topographical activity maps of the multiple frequency bands.

Main results. Experiments on the KUL, DTU, and PKU datasets with 0.1 s, 1 s, 2 s, and 5 s decision windows demonstrated that: (a) The proposed model outperformed SSF-CNN and state-of-the-art AAD models. Specifically, when the auditory stimulus was unavailable, AAD accuracy could be enhanced by at least , and on KUL, DTU, and PKU, respectively, compared with the baselines. On KUL, longer decision windows corresponded to smaller enhancement, whereas on DTU and PKU longer decision windows corresponded to larger enhancement, with two exceptions: the 2 s window on PKU and the 5 s window on DTU. (b) Each modification contributed to the performance enhancement.

Significance. The DE feature, multi-band frequency analysis, and ConvLSTM-based temporal analysis help to enhance AAD accuracy.
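
As an illustration of the per-band differential entropy feature mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes that each band-limited signal is approximately Gaussian, in which case the differential entropy reduces to 0.5 * ln(2 * pi * e * variance). The band edges in `BANDS`, the function names, and the DFT-based band selection are all hypothetical choices made for this example; the abstract does not specify them.

```python
import cmath
import math

# Hypothetical band edges in Hz; the abstract does not list the bands used.
BANDS = {"theta": (4, 8), "alpha": (8, 14), "beta": (14, 31)}


def band_variance(x, fs, lo, hi):
    """Variance of the part of x with frequency in [lo, hi), via Parseval."""
    n = len(x)
    total = 0.0
    # Positive-frequency DFT bins only, excluding DC and Nyquist.
    for k in range(1, n // 2):
        f = k * fs / n
        if lo <= f < hi:
            xk = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n)
                     for t in range(n))
            total += abs(xk) ** 2
    # Factor 2 accounts for the mirrored negative-frequency bins.
    return 2.0 * total / n ** 2


def differential_entropy(variance, eps=1e-12):
    """DE of a Gaussian with the given variance (assumption: Gaussian signal)."""
    # eps guards against log(0) for bands that contain no energy.
    return 0.5 * math.log(2 * math.pi * math.e * (variance + eps))


def de_features(x, fs):
    """Map each hypothetical band name to its DE value for one channel."""
    return {name: differential_entropy(band_variance(x, fs, lo, hi))
            for name, (lo, hi) in BANDS.items()}


# Synthetic one-second, single-channel example: a unit-amplitude 10 Hz sine.
fs = 128
signal = [math.sin(2 * math.pi * 10 * t / fs) for t in range(fs)]
features = de_features(signal, fs)
```

For the synthetic 10 Hz sine, all of the energy falls in the assumed alpha band, so its DE value is the largest of the three. In practice a band-pass filter bank over multichannel EEG would replace the direct DFT used here, which is kept only to make the sketch dependency-free.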