Abstract

Multimodal sentiment analysis extends traditional language-based sentiment analysis by drawing on data from additional modalities. It typically combines visual, textual, and acoustic representations to predict sentiment, and various data fusion methodologies have recently been proposed for this purpose. In most cases, the textual modality plays the major role, while the visual and acoustic modalities serve as auxiliary sources. However, general multimedia such as video does not come with text transcripts of an individual's speech. Research on audio-visual sentiment analysis that does not depend on the text modality is therefore essential for real-world industrial applications. Improving it is important because audio-visual sentiment analysis currently performs worse than multimodal sentiment analysis that includes the text modality. In this paper, we propose heterogeneous modality transfer learning (HMTL), which uses the knowledge in aligned text data as the source modality in transfer learning to improve audio-visual sentiment analysis performance. Our approach uses a decoder and adversarial learning techniques to reduce the gap between the source and target modalities in the embedding space for multimodal representation. In experiments, the proposed methodology outperformed recent unimodal and bimodal audio-visual sentiment analysis methods.

Highlights

  • Sentiment analysis is a series of methods for extracting an author’s emotions toward a given subject from their text [1]

  • This study aims to improve the performance of audio-visual sentiment analysis, which uses only acoustic and visual information, through heterogeneous transfer learning from textual information

  • This paper proposes heterogeneous modality transfer learning to improve the performance of audio-visual sentiment analysis without text


Summary

Introduction

Sentiment analysis is a series of methods for extracting an author’s emotions toward a given subject from their text [1]. It predicts the polarity of sentiment in a text by analyzing hidden information such as attitudes, opinions, and feelings. Deep learning-based sentiment analysis has recently achieved outstanding results in text polarity classification [2], [3]. These techniques are treated as a natural language processing task. Sentiment analysis of texts and attempts to classify speakers’ emotions in other modes of expression

