Abstract
With the explosive growth of multimodal content in online social communities, cross-modal learning has become crucial for accurate fake news detection. However, current multimodal fake news detection techniques struggle both to extract features from multiple modalities and to fuse cross-modal information, and so fail to fully exploit the correlations and complementarities between modalities. To address these issues, this paper proposes BTCM, a fake news detection model based on a one-dimensional CCNet (1D-CCNet) attention mechanism. The method first uses BERT and BLIP-2 encoders to extract text and image features. It then applies the proposed 1D-CCNet attention module to the input text and image sequences to emphasize the most important aspects of the bimodal features. In addition, the pre-trained BLIP-2 model is used to detect objects in images and generate image descriptions, which augment the text data and enrich the dataset; this step aims to further strengthen the correlations between modalities. Finally, a heterogeneous cross-feature fusion method (HCFFM) is proposed to integrate the image and text features. Comparative experiments were conducted on three public datasets: Twitter, Weibo, and Gossipcop. The results show that the proposed model achieved excellent performance.
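To make the data flow described in the abstract concrete, the following is a minimal sketch of attending across two modality sequences and then fusing them. It is an illustration under stated assumptions, not the paper's method: the abstract does not specify the internals of 1D-CCNet or HCFFM, so generic scaled dot-product cross-attention and mean-pool-then-concatenate fusion are swapped in. The function names (`cross_attention`, `mean_pool`, `fuse`) and the toy feature values are hypothetical; in practice the feature sequences would come from the BERT and BLIP-2 encoders.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys_values):
    """Generic scaled dot-product attention (a stand-in, NOT 1D-CCNet).

    Each query vector attends over every vector in keys_values and is
    replaced by the attention-weighted average of those vectors.
    """
    dim = len(queries[0])
    attended = []
    for q in queries:
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
            for k in keys_values
        ]
        weights = softmax(scores)
        attended.append(
            [sum(w * v[j] for w, v in zip(weights, keys_values)) for j in range(dim)]
        )
    return attended


def mean_pool(seq):
    """Average a sequence of vectors into a single vector."""
    n = len(seq)
    return [sum(vec[j] for vec in seq) / n for j in range(len(seq[0]))]


def fuse(text_feats, image_feats):
    """Hypothetical fusion (a stand-in, NOT HCFFM): cross-attend in both
    directions, pool each result, and concatenate the two pooled vectors."""
    text_ctx = cross_attention(text_feats, image_feats)   # text attends to image
    image_ctx = cross_attention(image_feats, text_feats)  # image attends to text
    return mean_pool(text_ctx) + mean_pool(image_ctx)


# Toy stand-ins for encoder outputs: 3 text tokens and 2 image patches, dim 4.
text_feats = [[1.0, 0.0, 0.5, 0.2], [0.1, 0.9, 0.3, 0.4], [0.7, 0.2, 0.1, 0.6]]
image_feats = [[0.2, 0.8, 0.1, 0.5], [0.9, 0.1, 0.4, 0.3]]

fused = fuse(text_feats, image_feats)
print(len(fused))  # 8: two pooled 4-dimensional vectors concatenated
```

The fused vector would then feed a binary classifier (real versus fake). Concatenating both attention directions keeps information about how the text relates to the image and vice versa, which is one simple way to expose cross-modal correlations to the classifier.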