Abstract

Multimodal sentiment analysis (MSA) has attracted increasing attention in recent years. This paper focuses on learning representations of multimodal data to improve prediction performance. We propose a model that assists the learning of modality representations through multitask learning and contrastive learning, and that obtains dynamic loss weights by considering the homoscedastic uncertainty of each task. Specifically, we design two groups of subtasks, which predict the sentiment polarity of unimodal and bimodal representations, to guide representation learning via a hard parameter-sharing mechanism in the upstream neural network; the weight of each task's loss is learned from its homoscedastic uncertainty. Moreover, a contrastive-learning-based training strategy is designed to reduce the inconsistency between training and inference caused by the randomness of the dropout layer, by minimizing the mean squared error (MSE) between the predictions of two submodels. Experimental results on the MOSI and MOSEI datasets show that, by comprehensively considering intramodality and intermodality interactions, our method outperforms current state-of-the-art methods.
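
The abstract mentions two mechanisms that can be sketched concretely. Below is a minimal, hypothetical PyTorch illustration (not the authors' code) of loss weighting based on homoscedastic uncertainty in the style of Kendall et al.; the class name and the simplified form `exp(-s_i) * L_i + s_i` are assumptions made for illustration.

```python
import torch
import torch.nn as nn

class UncertaintyWeightedLoss(nn.Module):
    """Combines per-task losses with weights derived from learnable
    homoscedastic uncertainty. Each task i contributes
    exp(-s_i) * L_i + s_i, where s_i = log(sigma_i^2) is a learned scalar
    (a common simplified variant of the Kendall et al. formulation)."""

    def __init__(self, num_tasks):
        super().__init__()
        # one log-variance parameter per task, initialised to 0 (sigma^2 = 1)
        self.log_vars = nn.Parameter(torch.zeros(num_tasks))

    def forward(self, task_losses):
        # task_losses: sequence of scalar losses, one per subtask
        total = 0.0
        for i, loss in enumerate(task_losses):
            precision = torch.exp(-self.log_vars[i])  # 1 / sigma_i^2
            total = total + precision * loss + self.log_vars[i]
        return total
```

Similarly, a minimal sketch of the dropout-consistency regularizer the abstract describes: the same batch is passed through the network twice, the differing dropout masks yield two "submodels", and the MSE between their predictions is added to the training loss. The function name and model interface are again assumptions for illustration.

```python
import torch.nn.functional as F

def dropout_consistency_loss(model, inputs):
    """Two stochastic forward passes over the same batch; dropout masks
    differ between passes, so the outputs act as two submodels. The MSE
    between them penalises the train/inference gap introduced by dropout."""
    pred_a = model(inputs)  # first forward pass (dropout mask A)
    pred_b = model(inputs)  # second forward pass (dropout mask B)
    return F.mse_loss(pred_a, pred_b)
```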
