Abstract

Multimodal fusion has become a common approach to emotion recognition. A fused feature vector reflects the subject's emotional state more comprehensively and therefore supports more accurate recognition. However, the choice of fusion inputs and of the feature fusion strategy strongly affects the final result. In this paper, we propose a subjective and objective feature fused neural network (SOFNN) for emotion recognition, which effectively learns spatial-temporal information from EEG signals and dynamically integrates EEG signals with eye movement signals. Specifically, we extract richer spatial and temporal information from the raw EEG signal through a series of 1-D convolution kernels of different sizes, and we verify the effectiveness of the extracted features experimentally. The sizes of the 1-D convolution kernels are determined by the characteristics of the raw EEG signal, such as its sampling rate and number of channels. We then design a subjective and objective feature fusion framework that adjusts the proportion of the two feature types through a dynamically learned weight vector, so as to fully exploit their respective advantages. We evaluate our model on the widely used SEED-IV dataset. On the four-class recognition task (happy, sad, fear, and neutral), our model achieves 86.27% accuracy with a standard deviation of 10.16%, outperforming existing methods. In addition, we conduct a series of ablation experiments to verify the effectiveness of each module. The results show that our model makes better use of the complementary relationship between subjective and objective features and thereby achieves better emotion recognition performance.
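To make the two ideas in the abstract concrete, the following is a minimal PyTorch sketch of (1) multi-scale 1-D convolutions over raw EEG and (2) fusion of EEG and eye-movement features through a learned weight vector. The framework choice, kernel sizes, feature dimensions, and module names (`MultiScaleEEGEncoder`, `WeightedFusion`) are illustrative assumptions, not the authors' published configuration.

```python
# Hedged sketch: multi-scale 1-D convolution over EEG plus dynamically weighted
# fusion with eye-movement features. All hyperparameters are assumptions.
import torch
import torch.nn as nn

class MultiScaleEEGEncoder(nn.Module):
    """Extracts temporal features with several 1-D kernels of different sizes."""
    def __init__(self, eeg_channels=62, kernel_sizes=(16, 32, 64), out_channels=32):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.Conv1d(eeg_channels, out_channels, k, padding=k // 2),
                nn.ReLU(),
                nn.AdaptiveAvgPool1d(1),   # pool over time -> one vector per branch
            )
            for k in kernel_sizes
        )

    def forward(self, x):                  # x: (batch, eeg_channels, time)
        feats = [branch(x).squeeze(-1) for branch in self.branches]
        return torch.cat(feats, dim=-1)    # (batch, out_channels * num_branches)

class WeightedFusion(nn.Module):
    """Learns a weight vector that balances EEG and eye-movement features."""
    def __init__(self, eeg_dim, eye_dim, fused_dim=64):
        super().__init__()
        self.eeg_proj = nn.Linear(eeg_dim, fused_dim)
        self.eye_proj = nn.Linear(eye_dim, fused_dim)
        self.gate = nn.Linear(2 * fused_dim, fused_dim)  # per-dimension weights

    def forward(self, eeg_feat, eye_feat):
        e, o = self.eeg_proj(eeg_feat), self.eye_proj(eye_feat)
        w = torch.sigmoid(self.gate(torch.cat([e, o], dim=-1)))
        return w * e + (1.0 - w) * o       # dynamically weighted combination

# Usage: 4-class emotion logits from one second of 200 Hz, 62-channel EEG
# plus a 31-dimensional eye-movement feature vector (assumed dimensions).
encoder = MultiScaleEEGEncoder()
fusion = WeightedFusion(eeg_dim=96, eye_dim=31)
classifier = nn.Linear(64, 4)
eeg = torch.randn(8, 62, 200)              # (batch, channels, samples)
eye = torch.randn(8, 31)                   # pre-extracted eye-movement features
logits = classifier(fusion(encoder(eeg), eye))
print(logits.shape)                        # torch.Size([8, 4])
```

The sigmoid gate here is one plausible way to realize a "dynamically learned weight vector"; the paper's actual fusion mechanism may differ in form.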
