Deep Learning Based Data Fusion Methods for Multimodal Emotion Recognition

Judith Nkechinyere Njoku, Angela C. Caliwag, Wansu Lim, Jin-Woo Jeong, Sangho Kim, Han-Jeong Hwang

doi:10.7840/kics.2022.47.1.79

Abstract

Multimodal emotion recognition is a robust and reliable method because it uses data from multiple modalities to build a more comprehensive representation of emotions. Data fusion is a key step in multimodal emotion recognition, because the accuracy of the recognition model depends largely on how the different modalities are combined. The goal of this paper is to compare the performance of deep learning (DL) based models on the tasks of data fusion and multimodal emotion recognition. The contributions of this paper are twofold: 1) We introduce three DL models for multimodal fusion and classification: early fusion, hybrid fusion, and multi-task learning. 2) We systematically compare the performance of these models on three multimodal datasets. Our experimental results demonstrate that multi-task learning achieves the best results across all modality combinations: 75.41%, 68.33%, and 78.75% for classification of three emotional states using audio-visual, EEG-audio, and EEG-visual data, respectively.
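As a rough illustration of the simplest of the three strategies named above, early (feature-level) fusion can be sketched as concatenating per-modality feature vectors before a single classifier. This is only a minimal sketch under stated assumptions, not the models evaluated in the paper: the feature dimensions, the random weights, and the helper names `early_fusion` and `classify` are all hypothetical, and a single linear layer stands in for a deep network.

```python
import math
import random


def early_fusion(audio_feats, visual_feats):
    """Early fusion: join the modality feature vectors into one input vector."""
    return list(audio_feats) + list(visual_feats)


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(fused, weights, biases):
    """One linear layer plus softmax (a stand-in for a deep classifier)."""
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(logits)


rng = random.Random(0)
audio = [rng.random() for _ in range(4)]    # hypothetical 4-dim audio features
visual = [rng.random() for _ in range(6)]   # hypothetical 6-dim visual features

fused = early_fusion(audio, visual)         # 10-dim joint representation

NUM_CLASSES = 3                             # three emotional states, as in the abstract
weights = [[rng.uniform(-0.1, 0.1) for _ in fused] for _ in range(NUM_CLASSES)]
biases = [0.0] * NUM_CLASSES

probs = classify(fused, weights, biases)    # one probability per emotional state
print(probs)
```

Hybrid fusion and multi-task learning differ in where the modalities are combined and in how many outputs are trained jointly; the sketch covers only the concatenation step.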

Full Text