Abstract

EEG-based emotion recognition has become an important task in affective computing and intelligent interaction. However, how to effectively combine the spatial, spectral, and temporal discriminative information of EEG signals to achieve better emotion recognition performance remains a challenge. In this paper, we propose a novel attention-based convolutional transformer neural network (ACTNN), which effectively integrates the crucial spatial, spectral, and temporal information of EEG signals and cascades a convolutional neural network and a transformer in a new way for the emotion recognition task. We first organize EEG signals into spatial–spectral–temporal representations. To enhance the distinguishability of features, spatial and spectral attention masks are learned for the representation of each time slice. Then, a convolutional module is used to extract local spatial and spectral features. Finally, we concatenate the features of all time slices and feed them into the transformer-based temporal encoding layer, which uses multi-head self-attention for global feature awareness. The average recognition accuracy of the proposed ACTNN on two public datasets, namely SEED and SEED-IV, is 98.47% and 91.90%, respectively, outperforming state-of-the-art methods. In addition, to explore the underlying reasoning process of the model and its neuroscientific relevance to emotion, we further visualize the spatial and spectral attention masks. The attention weight distribution suggests that the activities of the prefrontal and lateral temporal lobes of the brain, and the gamma band of EEG signals, might be more related to human emotion. The proposed ACTNN can be employed as a promising framework for EEG emotion recognition.
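To make the attention-masking step concrete, the following is a minimal NumPy sketch of how per-slice spatial and spectral masks could weight a spatial–spectral representation. The dimensions (62 channels as in the SEED cap, 5 frequency bands) and the stand-in parameters `w_spatial` and `w_spectral` are illustrative assumptions; in the actual model these masks are learned during training, not fixed vectors.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax used to normalize attention weights
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Hypothetical dimensions: T time slices, C EEG channels, B frequency bands
T, C, B = 6, 62, 5
rng = np.random.default_rng(0)
x = rng.standard_normal((T, C, B))   # spatial-spectral representation per slice

# Stand-ins for learned parameters (trained weights in the real model)
w_spatial = rng.standard_normal(C)
w_spectral = rng.standard_normal(B)

spatial_mask = softmax(w_spatial)    # (C,) one weight per electrode
spectral_mask = softmax(w_spectral)  # (B,) one weight per frequency band

# Broadcast both masks over every time slice before the convolutional module
x_attn = x * spatial_mask[None, :, None] * spectral_mask[None, None, :]
print(x_attn.shape)
```

After this weighting, each time slice would pass through the convolutional module, and the per-slice features would be concatenated along the time axis for the transformer encoder.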
