Abstract

On social media, images and text convey individuals' attitudes and feelings, making social media an indispensable part of people's lives. Sentiment analysis of social media content helps us understand social behavior and provide better recommendations. One such sentiment analysis task is polarity prediction. Although research on purely visual or purely textual sentiment analysis has made considerable progress, multimodal and cross-modal analysis that combines visual and textual correlation is still at an exploratory stage. To capture the semantic connection between images and captions, this paper proposes a cross-modal approach that considers both images and their captions when classifying image sentiment polarity. The method transfers correlations from textual content to images. First, an image and its corresponding caption are fed into an inner-class mapping model, where they are transformed into vectors in a Hilbert space and assigned labels by computing the inner-class maximum mean discrepancy (MMD). Then, a class-aware sentence representation (CASR) model assigns distributed representations to the labels using a class-aware attention-based gated recurrent unit (GRU). Finally, an inner-class dependency LSTM (IDLSTM) classifies the sentiment polarity. Experiments on the Getty Images and Twitter 1269 datasets demonstrate the effectiveness of our approach, and extensive results show that our model outperforms baseline solutions.
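As a concrete illustration of the inner-class mapping step, the sketch below estimates the squared MMD between image-feature and caption-feature samples of one class in a reproducing kernel Hilbert space. The Gaussian kernel, the bandwidth `sigma`, and all function names are assumptions for illustration; the paper's exact kernel choice and feature extractors are not specified here.

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel k(x, y) = exp(-||x - y||^2 / (2 * sigma^2))."""
    return np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma ** 2))

def mmd_squared(X, Y, sigma=1.0):
    """Estimate of squared MMD between two samples (assumes m, n >= 2).

    X: (m, d) array of image-feature vectors for one class.
    Y: (n, d) array of caption-feature vectors for the same class.
    """
    m, n = len(X), len(Y)
    # Within-sample kernel means (diagonal terms excluded for an unbiased estimate)
    k_xx = np.mean([gaussian_kernel(X[i], X[j], sigma)
                    for i in range(m) for j in range(m) if i != j])
    k_yy = np.mean([gaussian_kernel(Y[i], Y[j], sigma)
                    for i in range(n) for j in range(n) if i != j])
    # Cross-sample kernel mean
    k_xy = np.mean([gaussian_kernel(X[i], Y[j], sigma)
                    for i in range(m) for j in range(n)])
    return k_xx + k_yy - 2.0 * k_xy
```

In this setting, an unlabeled image's features could be assigned the label of the class whose caption features minimize this discrepancy, which matches the label-assignment role the abstract describes for the inner-class MMD step.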

Highlights

  • As social media thrives, analyzing the sentiments in tweets has attracted increasing attention from researchers

  • To deepen the understanding of image sentiment by exploring the correlation between visual content and textual context, this paper proposes a novel cross-modal model for image sentiment analysis

  • Unlike existing cross-modal sentiment analysis methods, this paper proposes an inner-class mapping method based on unsupervised maximum mean discrepancy (MMD), which learns cross-modal mapping correlations between images and their descriptions


Summary

INTRODUCTION

As social media thrives, analyzing the sentiments expressed in tweets has attracted increasing attention from researchers. To deepen the understanding of image sentiment by exploring the correlation between visual content and textual context, this paper proposes a novel cross-modal model for image sentiment analysis. The main contributions of this paper include the following: (1) A novel cross-modal image sentiment analysis model is proposed. This model extracts visual features and uses them as attention weights in an LSTM to relate the image to its corresponding textual description (caption), allowing an image's sentiment polarity to be predicted from semantically correlated descriptions (see the sketch below). (2) Unlike existing cross-modal sentiment analysis methods, an inner-class mapping method based on unsupervised maximum mean discrepancy (MMD) is proposed, which learns cross-modal mapping correlations between images and their descriptions.
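Since the introduction describes extracting visual features and using them as attention weights for an LSTM over the caption, the following minimal PyTorch sketch shows one plausible form of that idea. The module name, dimensions, and dot-product attention are hypothetical; the authors' exact architecture (including the class-aware GRU and IDLSTM stages) is not reproduced here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VisualAttentionLSTM(nn.Module):
    """Illustrative sketch: visual features reweight caption tokens before an LSTM.

    All dimensions and the attention form are assumptions, not the authors' exact model.
    """
    def __init__(self, vocab_size, embed_dim=128, visual_dim=2048,
                 hidden_dim=256, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Project CNN image features into the word-embedding space
        self.visual_proj = nn.Linear(visual_dim, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, tokens, visual_feats):
        # tokens: (batch, seq_len) word indices of the caption
        # visual_feats: (batch, visual_dim) features of the image
        words = self.embed(tokens)                    # (batch, seq_len, embed_dim)
        query = self.visual_proj(visual_feats)        # (batch, embed_dim)
        # Dot-product similarity of each caption word to the visual query
        scores = torch.bmm(words, query.unsqueeze(2)).squeeze(2)  # (batch, seq_len)
        alpha = F.softmax(scores, dim=1)              # visual attention weights
        attended = words * alpha.unsqueeze(2)         # reweight caption tokens
        _, (h_n, _) = self.lstm(attended)             # final hidden state
        return self.classifier(h_n[-1])               # sentiment polarity logits
```

The design choice illustrated here is that the image acts as a query over the caption, so words semantically related to the visual content contribute more to the sentiment decision.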

RELATED WORK
VISUAL AND TEXTUAL FEATURE EXTRACTION
JOINT MAPPING MODEL
EXPERIMENTS
Findings
CONCLUSION