Abstract

Multimodal sentiment analysis refers to the use of computers to identify the emotions people intend to express by analyzing features extracted from multiple modalities, and it plays a significant role in human-computer interaction and financial market prediction. Most existing approaches model contextual information within each modality; while this effectively captures intra-modal context, it often overlooks the correlations between modalities, which are equally critical to the final recognition result. This paper therefore proposes a multimodal sentiment analysis approach based on the Universal Transformer: the framework uses the Universal Transformer to model the connections between modalities while employing effective feature extraction methods to capture the contextual connections within individual modalities. We evaluated the proposed method on two benchmark datasets for multimodal sentiment analysis, CMU-MOSI and CMU-MOSEI, and it outperformed other methods of the same type.
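To make the described architecture concrete, the following is a minimal sketch, not the authors' implementation: it assumes per-modality features have already been produced by separate context encoders, and it applies a single weight-shared PyTorch `TransformerEncoderLayer` recurrently over the concatenated modality sequence, which is the defining trait of a Universal Transformer (timestep embeddings and adaptive computation time are omitted). All names, dimensions, and the pooling/classification head are hypothetical.

```python
import torch
import torch.nn as nn

class UniversalTransformerFusion(nn.Module):
    """Hypothetical sketch: one shared transformer layer applied
    recurrently (Universal Transformer style) over the concatenated
    sequence of text, audio, and visual features."""

    def __init__(self, d_model=128, nhead=4, num_steps=4, num_classes=2):
        super().__init__()
        # A single encoder layer whose weights are reused at every
        # recurrent step instead of stacking distinct layers.
        self.shared_layer = nn.TransformerEncoderLayer(
            d_model, nhead, dim_feedforward=256, batch_first=True)
        self.num_steps = num_steps
        self.classifier = nn.Linear(d_model, num_classes)

    def forward(self, text_feat, audio_feat, visual_feat):
        # Each input: (batch, seq_len, d_model), assumed to come from
        # modality-specific context encoders (e.g. LSTMs or CNNs).
        fused = torch.cat([text_feat, audio_feat, visual_feat], dim=1)
        for _ in range(self.num_steps):
            # Self-attention over the joint sequence lets tokens from
            # different modalities attend to one another.
            fused = self.shared_layer(fused)
        # Pool over the fused sequence and predict sentiment.
        return self.classifier(fused.mean(dim=1))

# Toy usage with random tensors standing in for real extracted features.
model = UniversalTransformerFusion()
t = torch.randn(8, 20, 128)   # text features
a = torch.randn(8, 20, 128)   # audio features
v = torch.randn(8, 20, 128)   # visual features
logits = model(t, a, v)
print(logits.shape)           # torch.Size([8, 2])
```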
