Abstract

With the development of the mobile Internet, social media plays an increasingly important role in people's lives. While it brings convenience, it also accelerates the spread of rumors, which seriously threatens social stability and everyday life. Rumors therefore need to be detected quickly and automatically, and machine-learning-based rumor detection has emerged to meet this need. However, social media rumors contain multimodal information such as text and images, so models that rely solely on textual information remain insufficient, and effectively exploiting multimodal information in social media has become an important open problem. Existing work has shown that images help improve rumor detection performance, but it still suffers from two problems: image information is under-utilized, and the small size of available datasets limits multimodal understanding. Our multi-image fusion multimodal rumor detection (MIF-MRD) method uses both the text and the multiple images in a social media post to detect rumors. To mitigate the limited multimodal understanding caused by small rumor-detection datasets, we introduce a pre-trained visual-language model and obtain stronger visual and text feature encoders by fine-tuning its visual and text encoders. To overcome the limited information provided by a single image, we feed multiple images into the model, extract text and multi-image features with the fine-tuned encoders, and design a multimodal rumor detection model based on multi-image fusion. In comparative experiments, our method outperforms the best baseline on multiple evaluation metrics and improves accuracy by 6.8%, which verifies its superiority.
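To make the described pipeline concrete, the following is a minimal sketch of the multi-image fusion idea: fine-tuned text and image encoders from a pre-trained visual-language model produce features, the per-post image features are pooled, and the fused representation is classified. The abstract does not specify the fusion mechanism or dimensions; the CLIP-style shared feature dimension, the attention pooling over images, and all names here are our illustrative assumptions, not the authors' published design.

```python
# Illustrative sketch only -- the encoder choice (CLIP-style), the
# attention-pooling fusion, and all dimensions below are assumptions,
# not the MIF-MRD implementation described in the paper.
import torch
import torch.nn as nn

class MultiImageFusionRumorDetector(nn.Module):
    def __init__(self, text_encoder, image_encoder, dim=512, num_classes=2):
        super().__init__()
        # Fine-tuned encoders taken from a pre-trained visual-language model;
        # each is assumed to map its input to a (batch, dim) feature tensor.
        self.text_encoder = text_encoder
        self.image_encoder = image_encoder
        # Learned attention weights for pooling a variable number of images.
        self.img_attn = nn.Linear(dim, 1)
        self.classifier = nn.Sequential(
            nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, num_classes)
        )

    def forward(self, text_inputs, images):
        # images: (B, N, C, H, W), N attached images per post.
        b, n = images.shape[:2]
        txt = self.text_encoder(text_inputs)            # (B, dim)
        img = self.image_encoder(images.flatten(0, 1))  # (B*N, dim)
        img = img.view(b, n, -1)                        # (B, N, dim)
        # Fuse the N image features via attention-weighted pooling.
        w = torch.softmax(self.img_attn(img), dim=1)    # (B, N, 1)
        fused_img = (w * img).sum(dim=1)                # (B, dim)
        # Concatenate text and fused multi-image features, then classify.
        return self.classifier(torch.cat([txt, fused_img], dim=-1))
```

Attention pooling is one plausible way to handle posts with varying numbers of attached images, since it weights informative images more heavily than simple averaging would; other fusion schemes (concatenation, cross-modal attention) would fit the same interface.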
