Abstract

Convolutional neural networks (CNNs) have been widely applied in motor imagery (MI)-based brain-computer interfaces (BCIs) to decode electroencephalography (EEG) signals. However, owing to the limited receptive field of convolutional kernels, CNNs extract features only from local regions and fail to capture the long-term dependencies needed for EEG decoding. Besides long-term dependencies, multi-modal temporal information is equally important for EEG decoding, because it offers a more comprehensive view of the temporal dynamics of neural processes. In this paper, we propose a novel deep learning network that combines a CNN with a self-attention mechanism to encapsulate multi-modal temporal information and global dependencies. The network first extracts multi-modal temporal information from two distinct perspectives: average and variance. A shared self-attention module is then designed to capture global dependencies along these two feature dimensions. We further design a convolutional encoder to explore the relationship between the average-pooled and variance-pooled features and fuse them into more discriminative features. Moreover, a data augmentation method called signal segmentation and recombination is proposed to improve the generalization capability of the network. Experimental results on the BCI Competition IV-2a (BCIC-IV-2a) and BCI Competition IV-2b (BCIC-IV-2b) datasets show that the proposed method outperforms state-of-the-art methods, achieving a 4-class average accuracy of 85.03% on BCIC-IV-2a. These results demonstrate the effectiveness of multi-modal temporal information fusion in attention-based deep learning networks and provide a new perspective for MI-EEG decoding. The code is available at https://github.com/Ma-Xinzhi/EEG-TransNet.
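The abstract does not give implementation details for the signal segmentation and recombination augmentation; the NumPy sketch below shows one common reading of the idea, in which each artificial trial is assembled by cutting real trials of the same class into equal-length time segments and concatenating segments drawn from randomly chosen trials of that class. The function name segment_recombine and the parameters n_segments and n_new_per_class are illustrative assumptions, not names taken from the released code.

    import numpy as np

    def segment_recombine(trials, labels, n_segments=8, n_new_per_class=None, seed=None):
        """Class-consistent segmentation-and-recombination augmentation (sketch).

        trials: array of shape (n_trials, n_channels, n_samples)
        labels: array of shape (n_trials,)
        Returns augmented trials and labels; trailing samples are dropped if
        n_samples is not divisible by n_segments.
        """
        rng = np.random.default_rng(seed)
        _, _, n_samples = trials.shape
        seg_len = n_samples // n_segments
        new_trials, new_labels = [], []
        for cls in np.unique(labels):
            cls_trials = trials[labels == cls]
            n_new = n_new_per_class or len(cls_trials)
            for _ in range(n_new):
                # For each segment position, copy that segment from a random
                # trial of the same class, then stitch the segments together.
                parts = []
                for s in range(n_segments):
                    donor = cls_trials[rng.integers(len(cls_trials))]
                    parts.append(donor[:, s * seg_len:(s + 1) * seg_len])
                new_trials.append(np.concatenate(parts, axis=-1))
                new_labels.append(cls)
        return np.stack(new_trials), np.array(new_labels)

    # Example: roughly double the training set with artificial trials
    # aug_x, aug_y = segment_recombine(train_x, train_y, n_segments=8)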
