Abstract

Multimodal cross-domain sentiment analysis aims to transfer domain-invariant sentiment information across datasets to address the scarcity of labeled data. Existing adaptation methods perform well by mitigating the discrepancies among the characteristics of multiple modalities. However, the expressive styles of different datasets also carry domain-specific information, which hinders adaptation performance. In this article, we propose a disentangled sentiment representation adversarial network (DiSRAN) to reduce the domain shift in expressive styles for multimodal cross-domain sentiment analysis. Specifically, we first align the multiple modalities and obtain a joint representation through a cross-modality attention layer. We then use adversarial training to disentangle sentiment information from the multimodal joint representation, which contains domain-specific expressive style. The resulting sentiment representation is domain-invariant and thus better facilitates the transfer of sentiment information between domains. Experimental results on two multimodal cross-domain sentiment analysis tasks demonstrate that the proposed method performs favorably against state-of-the-art approaches.
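The abstract describes the architecture only at a high level. The PyTorch sketch below illustrates the general pattern it names: a cross-modality attention layer producing a joint representation, followed by adversarial training that encourages domain invariance. The class names, dimensions, and the use of a gradient reversal layer are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn

class GradientReversal(torch.autograd.Function):
    # Identity in the forward pass; negates and scales the gradient in the
    # backward pass, so the encoder learns to confuse the domain discriminator.
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

class DiSRANSketch(nn.Module):
    # Hypothetical module: fuses two modalities with cross-modality attention,
    # then feeds a sentiment head and an adversarial domain head.
    def __init__(self, dim=128, num_heads=4, num_classes=2, num_domains=2):
        super().__init__()
        # Text features act as queries over the other modality's features.
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.sentiment_head = nn.Linear(dim, num_classes)
        self.domain_head = nn.Linear(dim, num_domains)

    def forward(self, text_feats, other_feats, lambd=1.0):
        # Joint multimodal representation via cross-modality attention.
        joint, _ = self.cross_attn(text_feats, other_feats, other_feats)
        pooled = joint.mean(dim=1)  # average over the sequence dimension
        sentiment_logits = self.sentiment_head(pooled)
        # Reversed gradients strip domain-specific (expressive-style) cues
        # from the shared representation.
        domain_logits = self.domain_head(GradientReversal.apply(pooled, lambd))
        return sentiment_logits, domain_logits

In training, one would minimize cross-entropy on the sentiment head while the reversed gradient from the domain head pushes the shared representation toward domain invariance, which is the standard adversarial recipe the abstract alludes to.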
