A link2vec-based fake news detection model using web search results

Jae-Seung Shim, Yunju Lee, Hyunchul Ahn

doi:10.1016/j.eswa.2021.115491

Abstract

Today, the world is under siege from various kinds of fake news on topics ranging from politics to COVID-19. Thus, many scholars have been researching automatic fake news detection based on artificial intelligence and machine learning (AI/ML) to prevent the spread of fake news. Mainstream research on fake news detection has so far focused on text-based approaches, but these have inherent limitations such as the difficulty of processing short texts and language dependency. Thus, the context-based approach is emerging as an alternative to the text-based approach. The most common context-based approach is the use of distributors’ network information in social media. However, such information is difficult to obtain, and only propagation within a single social media platform can be traced. Against this background, we propose using the composition pattern of web links that contain news content as a new source of information for fake news detection. To properly vectorize the composition pattern of web links, this study proposes a novel embedding technique called link2vec, an extension of word2vec. To test the effectiveness and language independence of our link2vec-based model, we applied it to two real-world fake news datasets in different languages (English and Korean). As comparison models, we adopted the conventional text-based model and a hybrid model, proposed by a prior study, that combines text and whitelist-based link information. The results showed that, on the datasets in both languages, the link2vec-based detection models outperformed all the comparison models with statistical significance. This research is expected to contribute by suggesting a completely new path for effective fake news detection.

Full Text