Abstract

Multimodal sentiment classification is an important research topic in natural language processing that has attracted considerable attention. In most multimodal sentiment research, every modality in a dataset shares a single unified label. However, this unified label may prevent the model from capturing the differing information carried by each modality during training. To address this issue, this paper proposes AtM-DNN, a multimodal attention fusion network combined with auxiliary loss functions for multimodal sentiment classification. Separate single-modal encoders map the raw data of each modality into vectors of different information dimensions. An attention mechanism controls the information weight of each modality in the interaction, removing redundant information. In addition, each modality is introduced into the loss function as an auxiliary term that contributes to the final loss. Experiments on the CH-SIMS dataset show that the per-modality attention mechanism improves the accuracy and F1 score of single-modal sentiment classification for each modality, and also improves the accuracy and F1 score of multimodal sentiment classification. Although the auxiliary loss terms improve multimodal sentiment classification, they tend to reduce the performance of the individual single-modal classifiers. All of these models outperform the single-modal sentiment classification baseline.
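To make the described architecture concrete, the following is a minimal sketch (not the authors' code) of the core idea: per-modality encoders, attention-weighted fusion, and per-modality auxiliary losses added to the fused loss. The modality dimensions, hidden size, class count, and the auxiliary weight `alpha` are illustrative assumptions, not values taken from the paper.

```python
import torch
import torch.nn as nn


class AttentionFusionClassifier(nn.Module):
    """Hypothetical AtM-DNN-style model: encode each modality, fuse with attention."""

    def __init__(self, dims=(768, 74, 35), hidden=128, num_classes=3):
        super().__init__()
        # Per-modality encoders map raw feature vectors to a shared hidden size.
        self.encoders = nn.ModuleList(
            [nn.Sequential(nn.Linear(d, hidden), nn.ReLU()) for d in dims]
        )
        # Attention scores control each modality's weight in the fused representation.
        self.attn = nn.Linear(hidden, 1)
        # One auxiliary classification head per modality, plus a head for the fused vector.
        self.aux_heads = nn.ModuleList([nn.Linear(hidden, num_classes) for _ in dims])
        self.fused_head = nn.Linear(hidden, num_classes)

    def forward(self, text, audio, visual):
        feats = [enc(x) for enc, x in zip(self.encoders, (text, audio, visual))]
        stacked = torch.stack(feats, dim=1)                  # (B, 3, hidden)
        weights = torch.softmax(self.attn(stacked), dim=1)   # (B, 3, 1)
        fused = (weights * stacked).sum(dim=1)               # (B, hidden)
        return self.fused_head(fused), [h(f) for h, f in zip(self.aux_heads, feats)]


def total_loss(fused_logits, aux_logits, labels, alpha=0.3):
    """Multimodal loss plus per-modality auxiliary losses; alpha is an assumed weight."""
    ce = nn.CrossEntropyLoss()
    return ce(fused_logits, labels) + alpha * sum(ce(a, labels) for a in aux_logits)
```

In this sketch, the auxiliary heads are trained on the same unified label as the fused head; how the paper weights or supervises the auxiliary terms may differ.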
