Abstract

<p>In recent years, people have increasingly expressed their feelings through both images and text, driving the growth of multimodal data. Multimodal data carries richer semantics and is better suited to judging people's real emotions. To fully learn the features of each modality and integrate the modal information, this paper proposes FCLAG, a fine-grained multimodal sentiment analysis method based on gating and attention mechanisms. First, the text is processed at both the character level and the word level: a CNN extracts fine-grained emotional information from characters, and an attention mechanism strengthens the representation of keywords. For images, a gating mechanism controls the flow of image information between networks. The image and text vectors jointly represent the original data. A bidirectional LSTM then performs further learning, enhancing the information interaction between modalities. Finally, the multimodal feature representation is fed into a classifier. The method is validated on a self-built image-and-text dataset. The experimental results show that, compared with other sentiment classification models, this method achieves greater improvements in accuracy and F1 score and can effectively improve the performance of multimodal sentiment analysis.</p>
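The abstract does not give implementation details of the gating mechanism that controls how image information flows into the fusion stage. As a rough illustration only, the sketch below shows a common element-wise sigmoid gate of the kind described; the function name, weights, and feature values are all hypothetical, not taken from the paper.

```python
import math

def sigmoid(x):
    """Standard logistic function; squashes any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def gate_image_features(image_feats, weights, bias):
    """Element-wise sigmoid gate (illustrative, not the paper's exact design).

    Each gate value lies in (0, 1) and scales how much of the corresponding
    image feature is passed on to the multimodal fusion stage.
    """
    gates = [sigmoid(w * f + bias) for w, f in zip(weights, image_feats)]
    gated = [g * f for g, f in zip(gates, image_feats)]
    return gates, gated

# Toy example with made-up feature values: a strongly positive feature is
# passed through almost unchanged, while a negative one is heavily damped.
feats = [0.5, -1.2, 3.0]
gates, gated = gate_image_features(feats, weights=[1.0, 1.0, 1.0], bias=0.0)
```

In a full model the gate parameters would be learned jointly with the rest of the network, so the gate can suppress uninformative image regions before the bidirectional LSTM fuses the image and text representations.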
