Abstract
News that intentionally contains false information, known as "fake news", is common on the Internet and often causes social disruption. To address this problem, automatic fake news detection using supervised learning has been actively studied. Although accuracy is improving, a major challenge for practical application remains: models do not work well on news from unknown fields (domains) because of domain bias. The goal of this study is to mitigate this domain bias and improve the accuracy of cross-domain fake news detection, in which models are tested on news from unknown domains. We first try to mitigate the bias by masking noun phrases, which are considered a major source of domain bias. However, masking does not improve accuracy. We then observe that the dataset used in this study always contains pairs of fake and real news on exactly the same topic. We focus on this property of the dataset and examine how it may affect domain bias and accuracy. Comparative experiments show that accuracy is higher when models are trained on a dataset with this paired property. We suggest that a fake news dataset consisting of paired news could be effective for cross-domain detection.