Recent studies have found that electroencephalographic (EEG) features from different frequency bands and different brain regions contribute differently to emotion recognition (ER) in virtual reality (VR) and two-dimensional (2D) scenes. Despite progress on ER in both settings, little prior effort has been devoted to developing a unified ER model applicable to both VR and 2D environments. We propose a novel Frequency-Band-Spatial-based Attention Network, named FBSA-Net, for ER in VR and 2D scenes. Specifically, FBSA-Net adaptively captures the frequency-band-spatial relationships of EEG signals through the cascaded fusion of a frequency-band attention (FBA) module and a spatial attention (SA) module. The FBA module automatically assigns attention weights to each frequency-band feature. The SA module then applies the ProbeParse attention mechanism to adaptively explore channel relationships both within and across brain regions, facilitating a comprehensive understanding of the global features of the EEG signals. The model's efficacy was validated on the VRSDEED dataset, created from emotion induction experiments using VR and 2D induction programs. Multiple comparative experiments show that FBSA-Net achieves state-of-the-art recognition results on the VRSDEED dataset, with average recognition accuracies of 96.63% and 96.49% in VR and 2D scenes, respectively. These results indicate that FBSA-Net's contribution lies not only in enabling ER with a single model in both VR and 2D environments, but also in its strong ability to distinguish emotional states in the two scenarios, providing a valuable reference for unified ER modeling.
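To make the cascaded design described above concrete, the following is a minimal, hedged sketch (not the authors' implementation) of a frequency-band attention stage followed by a channel-wise spatial attention stage. The input shape (batch, bands, channels, feature_dim), the 5-band/62-channel defaults, the pooling and MLP used to score bands, the classifier head, and the use of standard full self-attention in place of the paper's ProbeParse mechanism are all assumptions made for illustration.

```python
# Hedged sketch of an FBA -> SA cascade for EEG emotion recognition.
# Assumed input: per-trial features shaped (batch, bands, channels, feat_dim).
import torch
import torch.nn as nn


class FrequencyBandAttention(nn.Module):
    """Assigns a softmax weight to each frequency band and rescales its features."""
    def __init__(self, feat_dim: int):
        super().__init__()
        # Small MLP scores each band from its channel-pooled descriptor (assumed design).
        self.score = nn.Sequential(
            nn.Linear(feat_dim, feat_dim // 2),
            nn.Tanh(),
            nn.Linear(feat_dim // 2, 1),
        )

    def forward(self, x):                                 # x: (B, bands, channels, D)
        band_desc = x.mean(dim=2)                         # (B, bands, D), pooled over channels
        w = torch.softmax(self.score(band_desc), dim=1)   # (B, bands, 1) band weights
        return x * w.unsqueeze(2)                         # reweighted band features


class SpatialAttention(nn.Module):
    """Self-attention over EEG channels; full attention stands in for ProbeParse here."""
    def __init__(self, feat_dim: int, n_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(feat_dim, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(feat_dim)

    def forward(self, x):                                 # x: (B, bands, channels, D)
        b, nb, nc, d = x.shape
        x = x.reshape(b * nb, nc, d)                      # attend across channels per band
        out, _ = self.attn(x, x, x)
        return self.norm(x + out).reshape(b, nb, nc, d)


class FBSANetSketch(nn.Module):
    """Cascade: frequency-band attention -> spatial attention -> classifier head."""
    def __init__(self, n_bands=5, n_channels=62, feat_dim=32, n_classes=3):
        super().__init__()
        self.fba = FrequencyBandAttention(feat_dim)
        self.sa = SpatialAttention(feat_dim)
        self.head = nn.Linear(n_bands * n_channels * feat_dim, n_classes)

    def forward(self, x):                                 # x: (B, bands, channels, D)
        x = self.sa(self.fba(x))
        return self.head(x.flatten(1))


# Usage example: a batch of 8 trials, 5 bands, 62 channels, 32-dim features per channel.
logits = FBSANetSketch()(torch.randn(8, 5, 62, 32))
print(logits.shape)  # torch.Size([8, 3])
```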