Abstract

The extensive use of online information platforms over traditional news media has amplified the dissemination of fake news. Supervised machine learning-based techniques are widely used to detect fake news on social media. However, the performance of such models degrades in cross-domain scenarios. In this study, we empirically show that a model's performance depends on whether the setting is domain-specific or domain-agnostic. To conduct this study, we extracted tweets related to the Afghanistan crisis and developed a dataset which we call 'FakeBan'. The country has witnessed a sudden spread of misinformation, with several actors misusing it as ammunition, leading to far-reaching and troubling implications. We chose to study the most recent Afghanistan crisis and experimented with three distinct domains widely affected by fake news: national crisis, healthcare, and politics. Several established datasets are already available in the healthcare and politics domains; however, building a labeled dataset around a recent national crisis takes considerable time. We propose an adaptive fake news detection technique capable of selecting a model based on the domain setting (single or cross), thereby addressing the challenges arising from the voluminous and highly varied information available on social media. The results of our study affirm that for domain-specific data, machine learning classifiers perform well using a set of features selected from the twenty-one extracted features. In contrast, deep learning models, particularly BERT, outperform traditional machine learning classifiers in domain-agnostic cases.
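To make the adaptive idea concrete, the following is a minimal sketch, assuming a feature-based classifier is preferred when the incoming tweet's domain matches the training domain and a BERT-based classifier is used otherwise. The class name, the Random Forest choice, the `bert-base-uncased` checkpoint, and the method signatures are illustrative assumptions, not the paper's exact implementation.

```python
# Illustrative sketch (not the paper's implementation) of domain-aware model
# selection: engineered-feature classifier for the domain-specific case,
# BERT-based classifier for the domain-agnostic (cross-domain) case.
import torch
from sklearn.ensemble import RandomForestClassifier
from transformers import AutoTokenizer, AutoModelForSequenceClassification


class AdaptiveFakeNewsDetector:
    def __init__(self, training_domain: str):
        self.training_domain = training_domain
        # Feature-based model trained on hand-crafted tweet features
        # (e.g., a selected subset of the twenty-one extracted features).
        self.feature_model = RandomForestClassifier(n_estimators=200)
        # Hypothetical BERT checkpoint; in practice it would be fine-tuned
        # on labeled fake-news data before use.
        self.tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
        self.bert = AutoModelForSequenceClassification.from_pretrained(
            "bert-base-uncased", num_labels=2
        )

    def fit_features(self, X_features, y):
        """Train the feature-based classifier on engineered features."""
        self.feature_model.fit(X_features, y)

    def predict(self, tweet_text: str, tweet_features, source_domain: str) -> int:
        """Return 1 for fake, 0 for real, choosing the model by domain match."""
        if source_domain == self.training_domain:
            # Domain-specific case: use the feature-based classifier.
            return int(self.feature_model.predict([tweet_features])[0])
        # Domain-agnostic case: fall back to the BERT classifier.
        inputs = self.tokenizer(tweet_text, return_tensors="pt", truncation=True)
        with torch.no_grad():
            logits = self.bert(**inputs).logits
        return int(torch.argmax(logits, dim=-1).item())
```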
