An Interpretable Residual Spatio-Temporal Graph Attention Network for Multiclass Emotion Recognition from EEG
This study introduces an Interpretable Residual Spatio-Temporal Graph Attention Network (IRSTGANet) for EEG-based emotion recognition, combining temporal convolutional encoding with residual graph-attention layers. Evaluated on DEAP and SEED datasets, it outperformed state-of-the-art methods in classifying valence, arousal, and multiple emotion classes, providing both improved accuracy and interpretable insights into neural connectivity.
Automatic emotion recognition from EEG has become a key research frontier in recent years, as it extracts emotional states directly from brain dynamics. However, existing deep learning approaches often treat EEG either as a sequence or as a static spatial map, failing to jointly capture the temporal evolution and spatial dependencies underlying emotional responses. To address this limitation, we propose an Interpretable Residual Spatio-Temporal Graph Attention Network (IRSTGANet) that integrates temporal convolutional encoding with residual graph-attention blocks. The temporal module enhances short-term EEG dynamics, while the graph-attention layers learn adaptive inter-channel connectivity and preserve contextual information through residual links. Evaluated on the DEAP and SEED datasets, the proposed model achieved strong performance on binary valence and arousal classification, on four-class and nine-class classification on DEAP, and on three-class classification on SEED, exceeding state-of-the-art methods. These results demonstrate that combining temporal enhancement with residual graph attention yields both improved recognition performance and interpretable insights into emotion-related neural connectivity.
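The residual graph-attention block described above can be sketched in outline. Below is a minimal single-head layer in NumPy, assuming the usual GAT-style scoring (LeakyReLU over concatenated projected node pairs) and an identity residual link; the function names, shapes, and scoring rule are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def residual_graph_attention(H, W, a):
    """One residual graph-attention layer over EEG channel nodes.

    H : (N, F) node features (one row per electrode)
    W : (F, F) shared linear projection
    a : (2F,)  attention vector scoring each node pair
    """
    N = H.shape[0]
    Z = H @ W                                  # project node features
    # pairwise attention logits e_ij = LeakyReLU(a^T [z_i || z_j])
    logits = np.empty((N, N))
    for i in range(N):
        for j in range(N):
            s = a @ np.concatenate([Z[i], Z[j]])
            logits[i, j] = s if s > 0 else 0.2 * s   # LeakyReLU
    alpha = softmax(logits, axis=1)            # learned adaptive connectivity
    return H + alpha @ Z, alpha                # residual link preserves context

# toy example: 4 channels, 3 features per channel
rng = np.random.default_rng(0)
H = rng.standard_normal((4, 3))
out, alpha = residual_graph_attention(H, np.eye(3), rng.standard_normal(6))
```

The learned `alpha` matrix is what makes such models interpretable: each row is a soft connectivity profile for one electrode.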
- Research Article
- 10.1109/tim.2023.3277985
- Jan 1, 2023
- IEEE Transactions on Instrumentation and Measurement
Emotion recognition is important in brain-computer interface (BCI) applications, and building a recognition model that is robust across subjects and sessions is critical for emotion-based BCI systems. Electroencephalogram (EEG) is a widely used tool for recognizing emotional states, but it suffers from small amplitude, low signal-to-noise ratio, and non-stationarity, which lead to large differences across subjects. To address these problems, this paper proposes a new emotion recognition method based on a multi-source associate domain adaptation network that considers both domain-invariant and domain-specific features. First, separate branches were constructed for the multiple source domains, under the assumption that different EEG data share the same low-level features. Second, domain-specific features were extracted through one-to-one associate domain adaptation. Weighted scores for each source were then obtained from the distribution distance, and the multiple source classifiers were combined with the corresponding weights. Finally, EEG emotion recognition experiments were conducted on the SEED, DEAP, and SEED-IV datasets. In the cross-subject experiments, the average accuracies were 86.16% on SEED, 65.59% on DEAP, and 59.29% on SEED-IV; in the cross-session experiments, the accuracies on SEED and SEED-IV were 91.10% and 66.68%, respectively. The proposed method achieves better classification results than state-of-the-art domain adaptation methods.
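The distance-based weighting of source classifiers can be illustrated with a toy sketch. Euclidean distance between feature means stands in for the paper's distribution distance, and the per-source probability vectors stand in for the branch classifiers; both are assumptions for illustration only:

```python
import numpy as np

def source_weights(target, sources):
    """Weight each source domain by the inverse distance between its
    feature mean and the target's (a stand-in for the paper's
    distribution-distance weighting)."""
    mu_t = target.mean(axis=0)
    d = np.array([np.linalg.norm(s.mean(axis=0) - mu_t) for s in sources])
    w = 1.0 / (d + 1e-8)
    return w / w.sum()

def fuse_predictions(probs_per_source, weights):
    """Combine per-source class probabilities with the weighted scores."""
    return sum(w * p for w, p in zip(weights, probs_per_source))

rng = np.random.default_rng(1)
target  = rng.normal(0.0, 1.0, (50, 8))
sources = [rng.normal(0.1, 1.0, (50, 8)),   # close to the target domain
           rng.normal(3.0, 1.0, (50, 8))]   # far from the target domain
w = source_weights(target, sources)
fused = fuse_predictions([np.array([0.7, 0.3]), np.array([0.2, 0.8])], w)
```

Because the first source's distribution sits closer to the target, it receives the larger weight, so its classifier dominates the fused prediction.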
- Research Article
- 10.3390/sym15101822
- Sep 25, 2023
- Symmetry
Emotion recognition via electroencephalography (EEG) has been gaining increasing attention in applications such as human–computer interaction, mental health assessment, and affective computing. However, it poses several challenges, primarily stemming from the complex and noisy nature of EEG signals. Commonly adopted strategies involve feature extraction and machine learning techniques, which often struggle to capture intricate emotional nuances and may require extensive handcrafted feature engineering. To address these limitations, we propose a novel approach utilizing convolutional neural networks (CNNs) for EEG emotion recognition. Unlike traditional methods, our CNN-based approach learns discriminative cues directly from raw EEG signals, bypassing the need for intricate feature engineering. This approach not only simplifies the preprocessing pipeline but also allows for the extraction of more informative features. We achieve state-of-the-art performance on benchmark emotion datasets, namely DEAP and SEED datasets, showcasing the superiority of our approach in capturing subtle emotional cues. In particular, accuracies of 96.32% and 92.54% were achieved on SEED and DEAP datasets, respectively. Further, our pipeline is robust against noise and artefact interference, enhancing its applicability in real-world scenarios.
- Research Article
- 10.3390/s19235218
- Nov 28, 2019
- Sensors (Basel, Switzerland)
Much attention has been paid to recognizing human emotions from electroencephalogram (EEG) signals using machine learning. Emotion recognition is challenging because of the non-linear nature of the EEG signal. This paper presents an advanced signal processing method using a deep neural network (DNN) for EEG-based emotion recognition. The spectral and temporal components of the raw EEG signal are first retained in a 2D spectrogram before feature extraction, and a pre-trained AlexNet model extracts raw features from the 2D spectrogram of each channel. To reduce the feature dimensionality, a spatial- and temporal-based bag of deep features (BoDF) model is proposed. A vocabulary consisting of 10 cluster centers per class is computed with the k-means clustering algorithm, and each subject's emotion is then represented as a histogram over this vocabulary, built from the raw features of a single channel. Features extracted with the proposed BoDF model have considerably smaller dimensions. For classification, a support vector machine (SVM) and k-nearest neighbors (k-NN) are applied to the extracted features for the emotional states of the two datasets. Validated on the SJTU SEED and DEAP datasets, the BoDF model achieves 93.8% accuracy on SEED and 77.4% on DEAP, which is more accurate than other state-of-the-art methods for human emotion recognition.
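The BoDF pipeline, which clusters deep features into a small vocabulary and then histograms each recording over it, can be sketched as follows. The tiny hand-rolled k-means and the 16-dimensional random stand-in features are illustrative, not the AlexNet features the paper uses:

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Tiny k-means to build a vocabulary of cluster centers."""
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - C[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                C[j] = X[labels == j].mean(axis=0)
    return C

def bodf_histogram(features, vocab):
    """Represent a recording as a normalized histogram of nearest
    vocabulary words -- a much lower-dimensional feature vector."""
    words = np.argmin(((features[:, None] - vocab[None]) ** 2).sum(-1), axis=1)
    hist = np.bincount(words, minlength=len(vocab)).astype(float)
    return hist / hist.sum()

rng = np.random.default_rng(2)
deep_feats = rng.standard_normal((200, 16))   # stand-in for AlexNet features
vocab = kmeans(deep_feats, k=10)              # 10 cluster centers per class
h = bodf_histogram(deep_feats, vocab)
```

Whatever the dimensionality of the deep features, the recording is reduced to a vector of length `k`, which is what makes the downstream SVM/k-NN step cheap.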
- Research Article
- 10.1088/1741-2552/acb79e
- Feb 1, 2023
- Journal of Neural Engineering
Objective. Constructing an efficient human emotion recognition model based on electroencephalogram (EEG) signals is significant for realizing emotional brain–computer interaction and improving machine intelligence. Approach. In this paper, we present a spatial-temporal feature fused convolutional graph attention network (STFCGAT) model based on multi-channel EEG signals for human emotion recognition. First, we combined the single-channel differential entropy (DE) feature with the cross-channel functional connectivity (FC) feature to extract both the temporal variation and spatial topological information of EEG. A novel convolutional graph attention network then fuses the DE and FC features and extracts higher-level graph structural information with sufficient expressive power for emotion recognition. Furthermore, we introduced a multi-headed attention mechanism in the graph neural network to improve the generalization ability of the model. Main results. We evaluated the emotion recognition performance of the proposed model on the public SEED and DEAP datasets. It achieved classification accuracies of 99.11% ± 0.83% and 94.83% ± 3.41% in the subject-dependent and subject-independent experiments on SEED, and accuracies of 91.19% ± 1.24% and 92.03% ± 4.57% for discriminating arousal and valence in subject-independent experiments on DEAP. Notably, the model achieved state-of-the-art performance on cross-subject emotion recognition for both datasets. In addition, we gained insight into the proposed framework through both ablation experiments and analysis of the spatial patterns of the FC and DE features. Significance. These results prove the effectiveness of the STFCGAT architecture for emotion recognition and indicate significant differences in the spatial-temporal characteristics of the brain under different emotional states.
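The two base features above can be reproduced in a few lines of NumPy: DE of a band-limited channel under a Gaussian assumption, and FC sketched here as the absolute Pearson correlation matrix (one common choice; the paper's exact connectivity metric may differ):

```python
import numpy as np

def differential_entropy(x):
    """DE of a (band-filtered) channel; for a Gaussian signal
    DE = 0.5 * ln(2 * pi * e * variance)."""
    return 0.5 * np.log(2 * np.pi * np.e * np.var(x))

def functional_connectivity(X):
    """Cross-channel FC as the absolute Pearson correlation matrix,
    usable directly as a weighted adjacency for a graph network."""
    return np.abs(np.corrcoef(X))

rng = np.random.default_rng(3)
X = rng.standard_normal((32, 512))            # 32 channels x 512 samples
de = np.array([differential_entropy(ch) for ch in X])
fc = functional_connectivity(X)
```

`de` gives one scalar per channel (the node features), while `fc` gives a symmetric 32x32 matrix (the edges), which is exactly the node/edge split a graph attention network consumes.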
- Research Article
- 10.1016/j.jneumeth.2024.110276
- Sep 3, 2024
- Journal of Neuroscience Methods
Cross-subject emotion recognition in brain-computer interface based on frequency band attention graph convolutional adversarial neural networks
- Conference Article
- 10.1109/tencon55691.2022.9978122
- Nov 1, 2022
In this study, the continuous wavelet transform (CWT) is applied to multi-channel EEG signals to obtain wavelet coefficients in the time-frequency domain. The coefficients yield the spectral energy used as the input feature for detecting emotional states. The scalogram of each EEG signal covers a range of frequency components and indicates the percentage of the signal's total energy carried by each component. The spectral features obtained from the wavelet scalogram of each EEG channel are aggregated into the input frame of a combined neural network comprising a convolutional neural network (CNN) and long short-term memory (LSTM). The combined two-dimensional (2D) CNN-LSTM network extracts spatial and temporal sequence features simultaneously. This hybrid deep neural network showed notable emotion detection performance for both binary and multi-class classification with the modeled spatial-temporal-spectral features. The proposed approach is evaluated on the publicly available DEAP and SEED datasets. For binary classification of valence and arousal (high versus low), the obtained accuracies are 94.36% and 94.07%, respectively, while 94.41% is attained for 4-class classification on the DEAP dataset. Additionally, the model achieved 97.40% accuracy for multi-class emotion detection on the SEED dataset, significantly outperforming reported state-of-the-art systems built on CNN/LSTM and/or conventional temporal and spectral features.
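The scalogram energy feature can be sketched with a hand-rolled complex Morlet wavelet; the sampling rate, cycle count, and frequency grid below are illustrative choices, not the paper's settings:

```python
import numpy as np

def morlet(fs, freq, n_cycles=5):
    """Complex Morlet wavelet sampled at fs, centered on `freq` Hz."""
    dur = n_cycles / freq
    t = np.arange(-dur / 2, dur / 2, 1 / fs)
    sigma = n_cycles / (2 * np.pi * freq)
    return np.exp(2j * np.pi * freq * t) * np.exp(-t**2 / (2 * sigma**2))

def scalogram_energy(x, fs, freqs):
    """Per-frequency share of total wavelet energy -- the scalogram
    feature fed to the CNN-LSTM input frame."""
    energy = np.array([np.sum(np.abs(np.convolve(x, morlet(fs, f), "same")) ** 2)
                       for f in freqs])
    return energy / energy.sum()

fs = 128
t = np.arange(0, 2, 1 / fs)
x = np.sin(2 * np.pi * 10 * t)                # 10 Hz test oscillation
e = scalogram_energy(x, fs, np.array([4.0, 10.0, 30.0]))
```

On the 10 Hz test signal the energy share concentrates on the 10 Hz scale, as expected from the wavelet's bandpass behavior.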
- Research Article
- 10.1016/j.jneumeth.2023.110008
- Nov 13, 2023
- Journal of Neuroscience Methods
DSE-Mixer: A pure multilayer perceptron network for emotion recognition from EEG feature maps
- Research Article
- 10.3390/s24248174
- Dec 21, 2024
- Sensors (Basel, Switzerland)
Recent advances in emotion recognition through Artificial Intelligence (AI) have demonstrated potential applications in various fields (e.g., healthcare, advertising, and driving technology), with electroencephalogram (EEG)-based approaches demonstrating superior accuracy compared to facial or vocal methods due to their resistance to intentional manipulation. This study presents a novel approach to enhance EEG-based emotion estimation accuracy by emphasizing temporal features and efficient parameter space exploration. We propose a model combining Long Short-Term Memory (LSTM) with an attention mechanism to highlight temporal features in EEG data while optimizing LSTM parameters through Particle Swarm Optimization (PSO). The attention mechanism assigns weights to the LSTM hidden states, and PSO dynamically optimizes key parameters, including the number of units, batch size, and dropout rate. Using the DEAP and SEED datasets, which serve as benchmarks for EEG-based emotion estimation research, we evaluate the model's performance. For the DEAP dataset, we conduct a four-class classification over combinations of high and low valence and arousal states; for the SEED dataset, we perform a three-class classification of negative, neutral, and positive emotions. The proposed model achieves an accuracy of 0.9409 on the DEAP dataset, surpassing the previous state-of-the-art accuracy of 0.9100 reported by Lin et al., and attains an accuracy of 0.9732 on the SEED dataset, one of the highest accuracies among related studies. These results demonstrate that integrating the attention mechanism with PSO significantly improves the accuracy of EEG-based emotion estimation, contributing to the advancement of emotion recognition technology.
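The attention step, softmax weights over per-timestep hidden states, can be sketched independently of the LSTM itself. The scoring vector `w` below stands in for the learned attention parameters, and the PSO-tuned hyperparameters are omitted; both are illustrative assumptions:

```python
import numpy as np

def attention_pool(hidden_states, w):
    """Softmax attention over per-timestep hidden states: timesteps that
    score higher against `w` contribute more to the pooled feature."""
    scores = hidden_states @ w                  # one score per timestep
    alpha = np.exp(scores - scores.max())       # numerically stable softmax
    alpha /= alpha.sum()
    return alpha @ hidden_states, alpha         # weighted sum + the weights

rng = np.random.default_rng(4)
H = rng.standard_normal((100, 64))              # T=100 steps, 64 hidden units
pooled, alpha = attention_pool(H, rng.standard_normal(64))
```

The pooled vector replaces plain last-state or mean pooling, letting the classifier focus on the emotionally salient time segments that `alpha` highlights.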
- Research Article
- 10.1016/j.compbiomed.2023.106860
- Apr 14, 2023
- Computers in Biology and Medicine
Cross-subject EEG emotion recognition using multi-source domain manifold feature selection
- Research Article
- 10.1016/j.mex.2025.103468
- Dec 1, 2025
- MethodsX
Classifying emotions from EEG signals is important for enhancing human-computer interaction, monitoring mental health, and creating applications in the affective computing field. This study explores improving emotion recognition performance by applying traditional machine learning classifiers and boosting techniques to EEG data from the DEAP dataset. To categorize emotional states, we used four classifiers: K-Nearest Neighbors (KNN), Support Vector Machine (SVM), XGBoost, and Gradient Boosting. After applying a segmentation technique to capture the temporal interdependence of the EEG data, we extracted two important features: differential entropy (DE) and Higuchi's fractal dimension (HFD), selected for their ability to reflect the intricate neural dynamics associated with emotional processing. Five-fold cross-validation was used to estimate model performance, and hyperparameter tuning was conducted to optimize classifier efficiency. XGBoost achieved the highest accuracy (89% for valence and 88% for arousal), demonstrating its superior performance. Furthermore, cross-subject evaluation on the SEED dataset reinforced the approach's robustness, with XGBoost achieving 86% accuracy using HFD and 84% using DE. These results emphasize the effectiveness of combining advanced feature extraction with boosting algorithms for EEG-based emotion recognition, offering promising directions for real-world emotion-aware systems. The key findings of this research are as follows:
- Differential entropy and Higuchi's fractal dimension proved effective in capturing emotional brain dynamics.
- XGBoost outperformed the other classifiers on both the DEAP and SEED datasets.
- The proposed method is robust across subjects and datasets.
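Higuchi's fractal dimension, one of the two features above, can be computed directly from the standard definition: the slope of log curve length against log scale. The `kmax` value below is an illustrative choice:

```python
import numpy as np

def higuchi_fd(x, kmax=8):
    """Higuchi's fractal dimension of a 1-D signal: slope of
    log(curve length) versus log(1/k) over scales k = 1..kmax."""
    N = len(x)
    L = []
    for k in range(1, kmax + 1):
        Lk = 0.0
        for m in range(k):                       # k interleaved sub-series
            idx = np.arange(m, N, k)
            if len(idx) < 2:
                continue
            length = np.sum(np.abs(np.diff(x[idx])))
            Lk += length * (N - 1) / (len(idx) - 1) / (k * k)
        L.append(Lk / k)                         # average over the k offsets
    k_vals = np.arange(1, kmax + 1)
    slope, _ = np.polyfit(np.log(1.0 / k_vals), np.log(L), 1)
    return slope

rng = np.random.default_rng(5)
line  = np.linspace(0, 1, 1000)           # smooth signal -> FD near 1
noise = rng.standard_normal(1000)         # white noise   -> FD near 2
```

A smooth ramp yields an FD near 1 and white noise an FD near 2, which is why the measure separates low-complexity from high-complexity neural dynamics.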
- Research Article
- 10.3390/diagnostics13162624
- Aug 8, 2023
- Diagnostics
EEG-based emotion recognition has numerous real-world applications in fields such as affective computing, human-computer interaction, and mental health monitoring, and offers the potential for developing IoT-based, emotion-aware systems and personalized interventions using real-time EEG data. This study focused on unique EEG channel selection and feature selection methods to remove unnecessary data while retaining high-quality features, improving the overall efficiency of a deep learning model in terms of memory, time, and accuracy. Moreover, this work utilized a lightweight deep learning method, specifically a one-dimensional convolutional neural network (1D-CNN), to analyze EEG signals and classify emotional states. By capturing intricate patterns and relationships within the data, the 1D-CNN model accurately distinguished between emotional states (HV/LV and HA/LA). An efficient data augmentation method was also used to increase the sample size and observe the performance of the deep learning model with additional data. The study conducted EEG-based emotion recognition tests on the SEED, DEAP, and MAHNOB-HCI datasets, achieving mean accuracies of 97.6%, 95.3%, and 89.0% on MAHNOB-HCI, SEED, and DEAP, respectively. The results demonstrate significant potential for a cost-effective IoT device to collect EEG signals, enhancing the feasibility and applicability of the data.
- Research Article
- 10.1016/j.neunet.2025.108413
- May 1, 2026
- Neural Networks
ELAI-SGCN: An explainable lightweight adaptive information-perceiving spiking graph convolutional network for EEG-based emotion recognition
- Research Article
- 10.1109/taffc.2017.2712143
- Jul 1, 2019
- IEEE Transactions on Affective Computing
In this paper, we investigate stable patterns of electroencephalogram (EEG) over time for emotion recognition using a machine learning approach. Various findings on activated patterns associated with different emotions have been reported, but their stability over time has not been fully investigated. Here we focus on identifying EEG stability in emotion recognition. We systematically evaluate the performance of popular feature extraction, feature selection, feature smoothing, and pattern classification methods on the DEAP dataset and on a newly developed dataset called SEED. Discriminative Graph regularized Extreme Learning Machine with differential entropy features achieves the best average accuracies of 69.67% and 91.07% on the DEAP and SEED datasets, respectively. The experimental results indicate that stable patterns exhibit consistency across sessions: the lateral temporal areas activate more for positive than negative emotions in the beta and gamma bands; the neural patterns of neutral emotions show higher alpha responses at parietal and occipital sites; and negative emotions show significantly higher delta responses at parietal and occipital sites and higher gamma responses at prefrontal sites. The performance of our emotion recognition models shows that the neural patterns are relatively stable within and between sessions.
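The band-wise analysis above (delta/alpha/beta/gamma responses) rests on relative band power, which a simple FFT periodogram already captures. The band edges below follow common conventions rather than the paper's exact definitions:

```python
import numpy as np

BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 13),
         "beta": (13, 30), "gamma": (30, 45)}

def band_powers(x, fs):
    """Relative power in the classic EEG bands from the FFT periodogram."""
    psd = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), 1 / fs)
    total = psd[(freqs >= 1) & (freqs <= 45)].sum()
    return {name: psd[(freqs >= lo) & (freqs < hi)].sum() / total
            for name, (lo, hi) in BANDS.items()}

fs = 256
t = np.arange(0, 4, 1 / fs)
x = np.sin(2 * np.pi * 10 * t)              # pure 10 Hz (alpha-band) tone
p = band_powers(x, fs)
```

On the 10 Hz test tone, essentially all relative power lands in the alpha band, matching the kind of band-specific responses the paper reports at parietal and occipital sites.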
- Book Chapter
- 10.1007/978-981-99-0248-4_27
- Jan 1, 2023
EEG signals are a widely used modality for recognizing human emotions. However, limited EEG data remains a challenge because of the small number of recording participants, the need for experts to interpret the signals, and the high cost of recording equipment. This research proposed data augmentation schemes on EEG datasets to overcome the limited-data problem; augmenting the data helps the generalizability of the emotion recognition model. The EEG signals in the DEAP and SEED datasets are transformed into image samples using recurrence plots and spectrograms, and artificial recurrence plot and spectrogram samples are then generated using Pix2pix. These artificial samples drive the data augmentation process. LeNet5, ResNet50, MobileNet, and DenseNet121 are used for classification. The best data augmentation schemes are: appending 20,000 artificial recurrence plot samples to the DEAP and SEED training sets, appending 20,000 artificial spectrogram samples to the DEAP training set, and appending 15,000 artificial spectrogram samples to the SEED training set. The kappa coefficient is computed for each classification model under the best augmentation schemes. Among the compared classifiers, LeNet5 achieved the best accuracy on both SEED (98.58%) and DEAP (86.12%) when the spectrogram was used; LeNet5 trained on spectrogram samples is therefore a reliable and robust classification model. This finding implies that spectrograms are more promising than recurrence plots for human emotion recognition.
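The recurrence plot transform that turns a 1-D EEG trace into an image is essentially one line of NumPy; the threshold `eps` below is an illustrative choice:

```python
import numpy as np

def recurrence_plot(x, eps=0.1):
    """Binary recurrence plot: R[i, j] = 1 when samples i and j lie
    within eps of each other -- the image fed to the CNN classifiers."""
    D = np.abs(x[:, None] - x[None, :])       # pairwise sample distances
    return (D <= eps).astype(np.uint8)

t = np.linspace(0, 4 * np.pi, 200)
R = recurrence_plot(np.sin(t))                # periodic signal -> diagonal bands
```

For a periodic signal like the sine above, the plot shows the characteristic diagonal band structure that the image classifiers then learn from.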
- Research Article
- 10.46300/9106.2021.15.46
- Apr 29, 2021
- International Journal of Circuits, Systems and Signal Processing
A brain-computer interface (BCI) using electroencephalogram (EEG) signals is attractive for emotion recognition studies because of its resistance to deceptive human behavior, the most significant advantage of brain signals over speech or visual signals in this context. Major challenges in EEG-based emotion recognition are that manual feature extraction requires considerable effort, that EEG recordings show varying distributions across people and across time for the same person, and that network models often generalize poorly, leaving recognition systems with low robustness. Improved algorithms and machine learning technology help researchers recognize emotions more easily, and in recent years deep learning (DL) techniques, especially convolutional neural networks (CNNs), have made excellent progress in many applications. This study aims to reduce the manual feature extraction effort and improve single-model EEG emotion recognition using a CNN architecture with residual blocks. The dataset is shuffled, divided into training and testing sets, and fed to the model. On the DEAP dataset, classes 1-4 for both valence and arousal were recognized with accuracies of 90.69%, 91.21%, 89.66%, and 93.64%, respectively, for a mean accuracy of 91.3%. On the SEED dataset, negative emotion reached the highest accuracy at 94.86%, followed by neutral at 94.29% and positive at 93.25%, for a mean accuracy of 94.13%. The experimental results indicate that a CNN based on residual networks can achieve excellent recognition accuracy, superior to most recent approaches.
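The residual unit at the heart of such a CNN can be sketched in 1-D NumPy: the identity shortcut is added to the convolutional path before the final activation, so the raw signal (and, in training, the gradient) can bypass the convolutions. Filter sizes and weights below are illustrative:

```python
import numpy as np

def conv1d_same(x, w):
    """Same-length 1-D convolution (stride 1, zero padding)."""
    return np.convolve(x, w, mode="same")

def residual_block(x, w1, w2):
    """Minimal residual unit: y = ReLU(x + conv(ReLU(conv(x))))."""
    h = np.maximum(conv1d_same(x, w1), 0.0)   # conv + ReLU
    return np.maximum(x + conv1d_same(h, w2), 0.0)  # add identity shortcut

rng = np.random.default_rng(6)
x = rng.standard_normal(128)
y = residual_block(x, rng.standard_normal(3) * 0.1, rng.standard_normal(3) * 0.1)
# with all-zero weights the convolutional path vanishes and the block
# reduces to ReLU(x): the shortcut alone carries the signal
z = residual_block(x, np.zeros(3), np.zeros(3))
```

This degenerate case shows why residual stacks train well: even before the filters learn anything, information flows through the block unchanged.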