Abstract

Many existing image-text sentiment analysis methods consider only the interaction between the image and text modalities, ignoring the inconsistency and correlation between image and text data. To address this issue, an aspect-level multimodal image-text sentiment analysis model using a Transformer and multi-layer attention interaction is proposed. First, ResNet50 is used to extract image features, and RoBERTa-BiLSTM is used to extract text and aspect-level features. Then, aspect information and image-text information are fused at multiple levels through an aspect direct interaction mechanism and a deep attention interaction mechanism, filtering out text and image content unrelated to the given aspect. The sentiment representations of the text data, the image data, and the aspect are concatenated, fused, and passed through a fully connected layer. Finally, the designed sentiment classifier performs image-text sentiment analysis. This effectively improves the performance of image-text sentiment discrimination.
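The fusion pipeline described above can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the feature matrices stand in for outputs that would come from ResNet50 (image regions) and RoBERTa-BiLSTM (text tokens and the aspect), the attention is a simple scaled dot-product against the aspect vector, and the classifier weights are random and untrained. All names and dimensions are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # shared feature dimension (illustrative)

# Hypothetical pre-extracted features; in the paper these would come from
# ResNet50 (image regions) and RoBERTa-BiLSTM (text tokens / aspect).
text_feats = rng.standard_normal((5, d))  # 5 text token features
img_feats = rng.standard_normal((4, d))   # 4 image region features
aspect = rng.standard_normal(d)           # aspect-level representation

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def aspect_attend(feats, aspect_vec):
    """Aspect-guided attention: weight features by their relevance to the
    aspect, down-weighting content unrelated to the given aspect."""
    scores = feats @ aspect_vec / np.sqrt(feats.shape[-1])
    weights = softmax(scores)
    return weights @ feats  # (d,) aspect-conditioned summary

text_repr = aspect_attend(text_feats, aspect)
img_repr = aspect_attend(img_feats, aspect)

# Concatenate the text, image, and aspect representations, then apply a
# (random, untrained) fully connected layer over 3 sentiment classes.
fused = np.concatenate([text_repr, img_repr, aspect])  # (3 * d,)
W = rng.standard_normal((3, fused.shape[0]))
b = np.zeros(3)
probs = softmax(W @ fused + b)  # class probabilities, sum to 1
```

In the actual model, the "deep attention interaction" would be stacked multi-head attention layers rather than this single dot-product step, but the data flow (attend per modality, concatenate, classify) is the same.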

