Abstract

During the COVID‐19 pandemic, online social networks were used more than ever before, with usage up by 8.4%, which led to the spread of false information about COVID‐19. Although many fake news detection models exist, annotation inconsistency, memory consumption, and the lack of accurate, efficient self‐trained algorithms for detecting emerging COVID‐19 misinformation tweets remain challenges. The main aim of this work is therefore to develop a self‐trained semi‐supervised model that accurately and automatically assesses the reliability of emerging COVID‐19 tweets without delay. In this work, a dataset of English‐language COVID‐19 tweets posted between January 2020 and January 2022 is created as a ground truth database. A self‐trained semi‐supervised hybrid deep learning model is then proposed that trains its supervised and unsupervised components simultaneously on the created dataset. The proposed model is self‐trained repeatedly, and each update lets it predict the reliability of upcoming COVID‐19 tweets that differ from the training tweets. We repeated the experiments while limiting the percentage of labelled tweets shown to the model to 80%, 50%, 40%, 30%, 20% and 10%, respectively. Experimental results show that the proposed model achieves 80.92% accuracy in the 10% label‐seen experiment and 98.15% accuracy in the 80% label‐seen experiment, a clear rising trend in performance as the share of labelled tweets increases. This technique will therefore be useful for effectively classifying the large volume of emerging tweets generated as part of the COVID‐19 infodemic. By making efficient use of large amounts of unlabelled tweets, the proposed model may also improve its generalization performance.