Abstract

Fake news has a destructive impact on our lives. With the rapid growth of multimodal content in social media, multimodal fake news detection has received increasing attention. Most existing approaches learn the deep semantics of each modality separately and integrate them through traditional fusion modes (e.g., concatenation or addition) to improve detection performance, with some success. However, they suffer from two crucial issues: (1) shallow cross-modal feature fusion, and (2) difficulty in capturing inconsistent information. To this end, we propose the Multimodal Fusion and Inconsistency Reasoning (MFIR) model, which discovers multimodal inconsistent semantics for explainable fake news detection. Specifically, MFIR consists of three modules: (1) unlike traditional fusion modes, cross-modal infiltration fusion continuously infiltrates and correlates the features of another modality into the internal semantics of the current modality, which preserves the contextual semantics of the original modality; (2) multimodal inconsistency learning captures local inconsistent semantics from the textual and visual perspectives, and integrates the two types of local semantics to discover global inconsistent semantics in multimodal content; (3) to make the inconsistent semantics more interpretable as evidence for users, an explanation reasoning layer supplements their contextual information, yielding more understandable evidence semantics. Extensive experiments on three datasets confirm the effectiveness of our model, which improves performance by up to 2.8%.
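To make the fusion idea concrete, below is a minimal sketch of what an infiltration-style fusion step could look like. The abstract does not specify the mechanism, so this sketch assumes it resembles cross-attention with a residual path (the residual being one plausible way to retain the original modality's contextual semantics); the class name, dimensions, and layer choices are illustrative, not the authors' implementation.

```python
import torch
import torch.nn as nn

class CrossModalInfiltration(nn.Module):
    """Hypothetical infiltration step: features of another modality are
    correlated into the current modality via cross-attention, while a
    residual connection retains the original contextual semantics."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        # The current modality queries the other modality's features.
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, current: torch.Tensor, other: torch.Tensor) -> torch.Tensor:
        # current: (batch, len_cur, dim); other: (batch, len_other, dim)
        infiltrated, _ = self.cross_attn(query=current, key=other, value=other)
        # Residual path keeps the original modality's context intact.
        return self.norm(current + infiltrated)


# Usage: text token features absorb image patch features (and vice versa,
# by swapping the arguments); shapes here are arbitrary examples.
text = torch.randn(2, 16, 256)   # (batch, tokens, dim)
image = torch.randn(2, 49, 256)  # (batch, patches, dim)
layer = CrossModalInfiltration(dim=256)
fused_text = layer(text, image)  # (2, 16, 256)
```

Stacking several such layers, alternating which modality plays the "current" role, would realize the "continuous infiltration" the abstract describes, again under the stated assumption about the mechanism.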
