Abstract

Multimodal sentiment classification is an active research field that aims to refine sentiment information and classify sentiment tendency from sequential multimodal data. Most existing sentiment recognition algorithms explore multimodal fusion schemes and achieve good performance. However, two key challenges remain. First, it is essential to effectively extract inter- and intra-modality features prior to fusion while simultaneously reducing ambiguity. Second, it is difficult to learn modality-invariant representations that capture the underlying similarities across modalities. In this paper, we present a modality-invariant temporal learning technique and a new gated inter-modality attention mechanism to address these issues. For the first challenge, the proposed gated inter-modality attention mechanism performs modality interactions and adaptively filters inconsistencies across modalities. We also use parallel structures to learn more comprehensive sentiment information in pairs (i.e., acoustic and visual). For the second challenge, we treat each modality as a multivariate Gaussian distribution (with each timestamp modeled as a single Gaussian) and use the KL divergence to capture implicit temporal distribution-level similarities. These strategies help reduce domain shifts between modalities and extract effective sequential modality-invariant representations. Experiments on several public datasets (i.e., YouTube and MOUD) show that our proposed method outperforms state-of-the-art multimodal sentiment classification methods.
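To make the two ideas in the abstract concrete, the following is a minimal, hypothetical PyTorch sketch (not the authors' implementation): a gated cross-modal attention block in which one modality attends to another and a sigmoid gate adaptively suppresses inconsistent information, and a per-timestamp KL divergence between diagonal Gaussians that can serve as a distribution-level similarity between modalities. All module names, dimensions, and hyperparameters here are illustrative assumptions.

```python
import torch
import torch.nn as nn


class GatedInterModalityAttention(nn.Module):
    """Hypothetical sketch of a gated inter-modality attention block.

    A query modality (e.g., text) attends to a context modality
    (e.g., acoustic or visual); a sigmoid gate then decides, per
    timestamp and feature, how much of the attended cross-modal
    signal to keep, filtering inconsistencies adaptively.
    """

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, query_seq: torch.Tensor, context_seq: torch.Tensor) -> torch.Tensor:
        # Cross-modal attention: query modality attends to the context modality.
        attended, _ = self.attn(query_seq, context_seq, context_seq)
        # Gate computed from the query features and the attended features.
        g = torch.sigmoid(self.gate(torch.cat([query_seq, attended], dim=-1)))
        # Residual connection with gated cross-modal information.
        return query_seq + g * attended


def gaussian_kl_per_timestep(mu_a, logvar_a, mu_b, logvar_b):
    """KL divergence between diagonal Gaussians at each timestamp.

    Treating every timestamp of a modality as a single Gaussian, the
    sequence forms a multivariate Gaussian; summing the per-timestamp
    KL terms yields a temporal, distribution-level discrepancy that
    can be minimized to align two modalities.
    """
    var_a, var_b = logvar_a.exp(), logvar_b.exp()
    kl = 0.5 * (logvar_b - logvar_a + (var_a + (mu_a - mu_b) ** 2) / var_b - 1.0)
    # Sum over feature dimensions, average over timestamps and batch.
    return kl.sum(dim=-1).mean()
```

In such a setup, two parallel gated attention blocks (text-acoustic and text-visual) could be trained jointly with the KL term added to the classification loss, so that the modality-specific sequences are pulled toward a shared, modality-invariant distribution; this is one plausible reading of the abstract, not a description of the published architecture.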
