Fake news on social media has become a social problem. Fake news refers to false information that is deliberately intended to deceive people. Several studies have been conducted on automatic detection systems that reduce the damage caused by fake news. However, most studies focus on improving detection accuracy and rarely discuss real-world operation. Because the content and expressions of fake news change over time, a model that initially achieves high detection accuracy loses accuracy after a few years. This phenomenon is called concept drift. Because most conventional methods rely on word representations, they suffer accuracy degradation as word trends and usage change. In contrast, methods that use the sentiment information of words can identify inflammatory sentences, which are characteristic of fake news, and may therefore suppress the performance degradation caused by concept drift. In this study, a model using vector representations obtained from an emotion dictionary was compared with a model using conventional word embeddings. We then evaluated the models' resistance to performance degradation. The results revealed that the method using sentiment representations is less susceptible to concept drift. Models and learning methods that achieve both detection accuracy and resistance to accuracy degradation can enable further development of fake news detection systems.