Abstract
In recent years, a growing number of people have chosen to express their feelings and opinions on social media through text and pictures. As a result, the amount of multimodal data with text and pictures as its main content keeps increasing. By analyzing the sentiment of such multimodal data, people's attitudes and opinions can be understood. To address the problem of information redundancy in the multimodal sentiment classification task, first, an image feature extraction model is built on an attention neural network, which highlights the key areas of the image that carry sentiment information. Second, using the tensor fusion method, the tensor product of the text and image modalities is taken as the joint feature representation of the multimodal data. After that, a specific information extraction module is designed to extract the fused features and eliminate the redundant information in the joint features. Experiments on two real Twitter image-text datasets show that the proposed model outperforms existing models in sentiment classification.
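The tensor fusion step can be illustrated with a minimal sketch. It assumes each modality has already been reduced to a single feature vector; the function names `tensor_fusion` and `flatten` and the toy values are hypothetical and are not taken from the paper, whose actual feature dimensions and implementation are not given in the abstract.

```python
from typing import List


def tensor_fusion(text_feat: List[float], image_feat: List[float]) -> List[List[float]]:
    """Return the outer (tensor) product of two unimodal feature vectors.

    Entry [i][j] is text_feat[i] * image_feat[j], so every text feature
    is paired with every image feature.
    """
    return [[t * v for v in image_feat] for t in text_feat]


def flatten(matrix: List[List[float]]) -> List[float]:
    """Flatten the joint feature matrix row by row into one vector."""
    return [x for row in matrix for x in row]


# Toy feature vectors (assumed values, for illustration only).
text_feat = [1.0, 2.0]
image_feat = [3.0, 4.0, 5.0]

joint = tensor_fusion(text_feat, image_feat)
joint_vector = flatten(joint)
```

The joint representation has as many entries as the product of the two input dimensions (here 2 × 3 = 6), which suggests why a later module for removing redundant information is useful: the fused feature grows quickly and many of its entries may carry overlapping information.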