STP-MFM: Semi-tensor product-based multi-modal factorized multilinear pooling for information fusion in sentiment analysis

Fen Liu, Jianfeng Chen, Kemeng Li, Jisheng Bai, Weijie Tan, Chang Cai, Muhammad Saad Ayub

doi:10.1016/j.dsp.2023.104265

Abstract

Multi-modal fusion exploits complementary information from different modalities to improve the accuracy of prediction and classification tasks. In this paper, we propose a semi-tensor product-based multi-modal factorized multilinear (STP-MFM) pooling method for information fusion in sentiment analysis. First, we extend bilinear pooling to multilinear pooling for multi-modal fusion. Next, we propose a multi-modal factorized multilinear pooling (MFM) method, which parametrizes the fusion weight tensor with the Tucker decomposition. Furthermore, we use the semi-tensor product (STP) in MFM to obtain more flexible and compact tensor decompositions with smaller factor sizes; the semi-tensor mode product permits the connection of two factors with different dimensionality. The proposed method thus removes the dimension-consistency requirement of matrix multiplication and expresses the information in a more compact structure that uses less memory. Most importantly, the STP leverages temporal and spatial information from video, audio, and text, producing a better representation of intra-modality correlations. We evaluated the proposed STP-MFM for sentiment analysis on the CMU-MOSI and IEMOCAP datasets. The experimental results indicate that the proposed method outperforms the baselines by a significant margin, while also training faster and reducing model complexity.
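
To make the dimension-consistency point concrete, the following is a minimal NumPy sketch of the standard left semi-tensor product of two matrices, A ⋉ B = (A ⊗ I_{t/n})(B ⊗ I_{t/p}), where A has n columns, B has p rows, and t = lcm(n, p). It is an illustration of the general operation only, not the paper's STP-MFM implementation; the function name `semi_tensor_product` and the example matrices are assumptions introduced here.

```python
import math

import numpy as np


def semi_tensor_product(a, b):
    """Left semi-tensor product of 2-D arrays a (m x n) and b (p x q).

    Hypothetical helper for illustration; not taken from the paper.
    Both operands are lifted with Kronecker products by identity matrices
    so that the inner dimensions agree, then multiplied as usual.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n = a.shape[1]                      # columns of the left factor
    p = b.shape[0]                      # rows of the right factor
    t = n * p // math.gcd(n, p)         # least common multiple of n and p
    left = np.kron(a, np.eye(t // n))   # shape (m * t/n, t)
    right = np.kron(b, np.eye(t // p))  # shape (t, q * t/p)
    return left @ right


# A small 2 x 2 factor acting on a length-4 feature column (assumed data).
W = np.array([[1.0, 2.0],
              [3.0, 4.0]])
x = np.arange(4.0).reshape(4, 1)
y = semi_tensor_product(W, x)           # equals np.kron(W, np.eye(2)) @ x

# With matching inner dimensions the operation is ordinary matrix product.
C = np.arange(6.0).reshape(2, 3)
D = np.arange(12.0).reshape(3, 4)
same_as_matmul = np.allclose(semi_tensor_product(C, D), C @ D)
```

In this sketch the 2 x 2 factor `W` holds 4 parameters yet maps a 4-dimensional input to a 4-dimensional output, whereas an ordinary matrix product would need a 4 x 4 factor with 16 parameters. This is the sense in which smaller factors can be connected to larger ones.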
