Abstract

Sentiment analysis has broad application prospects in the field of social opinion mining. The openness and anonymity of the internet make users' expression styles more diverse, leading to a proliferation of complicated contexts in which different unimodal data carry inconsistent sentiment tendencies. However, most sentiment analysis algorithms focus only on designing multimodal fusion methods without preserving the individual semantics of each modality. To avoid misunderstandings caused by ambiguity and sarcasm in complicated contexts, we propose a multimodal mutual attention-based sentiment analysis (MMSA) framework adapted to complicated contexts. The framework consists of three levels of subtasks: preserving the unique semantics of each modality while enhancing the common semantics, mining the association between unique and common semantics, and balancing decisions drawn from unique and common semantics. Within the framework, a multiperspective and hierarchical fusion (MHF) module is developed to fully fuse multimodal data, in which different modalities mutually constrain one another and the fusion order is adjusted at each subsequent step to enhance cross-modal complementarity. To address class imbalance, we compute the loss by applying different weights to positive and negative samples. Experimental results on the CH-SIMS multimodal dataset show that our method outperforms existing multimodal sentiment analysis algorithms. The code of this work is available at https://gitee.com/viviziqing/mmsacode.
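As a minimal sketch of the class-weighted loss mentioned above, the following assumes a binary positive/negative sentiment label and uses PyTorch's built-in `pos_weight` mechanism; the weight value, tensor shapes, and label convention are illustrative assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn as nn

# Hypothetical model outputs (logits) and binary sentiment labels for a batch of 8 samples:
# 1 = positive sentiment, 0 = negative sentiment.
logits = torch.randn(8)
labels = torch.randint(0, 2, (8,)).float()

# pos_weight > 1 upweights positive samples relative to negative ones;
# the value 2.0 is an assumed imbalance ratio for illustration only.
pos_weight = torch.tensor([2.0])
criterion = nn.BCEWithLogitsLoss(pos_weight=pos_weight)

loss = criterion(logits, labels)
print(loss.item())
```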
