Abstract

Because multimodal learning can exploit the complementarity of multimodal signals, multimodal emotion recognition usually outperforms recognition based on a single modality. In this paper, we introduce deep generalized canonical correlation analysis with an attention mechanism (DGCCA-AM) for multimodal emotion recognition. The model extends conventional canonical correlation analysis (CCA) from two modalities to an arbitrary number of modalities and performs adaptive multimodal fusion with an attention mechanism. By adjusting the weight matrices to maximize the generalized correlation across modalities, DGCCA-AM extracts emotion-related information from multiple modalities and discards noise. The attention mechanism lets a neural network learn adaptive fusion weights for the different modalities, yielding more effective multimodal fusion and better emotion recognition performance. We evaluate DGCCA-AM on a public multimodal dataset, SEED-V. Our experimental results show that DGCCA-AM achieves a state-of-the-art mean accuracy of 82.11% with a standard deviation of 2.76% on five-class emotion recognition with three modalities.
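
To make the two components named above concrete: DGCCA (in its standard formulation) learns per-modality networks f_j and projection matrices U_j together with a shared representation G by minimizing Σ_j ||G − f_j(X_j)U_j||_F² subject to GᵀG = I, which is one way of "maximizing the generalized correlation" across modalities. The sketch below illustrates only the attention-based fusion step, not the authors' released implementation; the class name, the linear scoring layer, and all dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    """Hypothetical attention-based fusion over per-modality embeddings.

    A shared scoring layer maps each modality embedding to a scalar;
    a softmax over modalities turns the scores into fusion weights.
    """

    def __init__(self, dim: int):
        super().__init__()
        # One scoring vector shared across modalities (an assumption;
        # the paper does not specify the exact scoring function).
        self.score = nn.Linear(dim, 1)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (batch, n_modalities, dim), e.g. projected features
        # from three modalities on SEED-V.
        weights = torch.softmax(self.score(feats).squeeze(-1), dim=-1)  # (batch, n_modalities)
        # Weighted sum produces one fused representation per sample.
        return (weights.unsqueeze(-1) * feats).sum(dim=1)  # (batch, dim)

# Usage: fuse three 64-dimensional modality embeddings for a batch of 8.
fusion = AttentionFusion(dim=64)
fused = fusion(torch.randn(8, 3, 64))
print(fused.shape)  # torch.Size([8, 64])
```

Because the weights are produced by a learnable layer and normalized with a softmax, the network can adaptively emphasize whichever modality is most informative for a given sample, which is the behavior the abstract attributes to the attention mechanism.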
