Abstract

A vast amount of data is generated every second by microblogs, content sharing on social media sites, and social networking. Twitter is a popular microblog where people voice their opinions about daily issues. Analyzing these opinions is the primary concern of sentiment analysis (SA), also called opinion mining. Efficiently capturing, gathering, and analyzing sentiments has been challenging for researchers. To address these challenges, we propose a highly accurate approach for SA of fake news on COVID-19. The dataset contains fake news on COVID-19; we started with data preprocessing (replacing missing values, noise removal, tokenization, and stemming). We applied a semantic model with term frequency and inverse document frequency (TF-IDF) weighting for data representation. In the measurement and evaluation step, we applied eight machine learning algorithms (Naive Bayes, AdaBoost, K-nearest neighbors, random forest, logistic regression, decision tree, neural networks, and support vector machine) and four deep learning models (CNN, LSTM, RNN, and GRU). Afterward, based on the results, we built a highly efficient prediction model with Python, and we trained and evaluated the classification model according to performance measures (confusion matrix, classification rate, true positive rate...), then tested the model on a set of unclassified fake news on COVID-19 to predict the sentiment class of each item. The results obtained demonstrate high accuracy compared to the other models. Finally, a set of recommendations is provided with future directions for this research to help researchers select an efficient sentiment analysis model for Twitter data.
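As a rough illustration of the preprocessing and TF-IDF weighting steps named in the abstract, the following is a minimal sketch in plain Python. It is not the authors' implementation: the function names `tokenize` and `tf_idf` are hypothetical, the weighting uses the textbook form tf × log(N / df), missing values are simply treated as empty text, and stemming is left out because it would normally come from an external library.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Replace a missing value with empty text, remove URL noise, and tokenize."""
    text = (text or "").lower()
    text = re.sub(r"https?://\S+", " ", text)  # noise removal: drop URLs
    return re.findall(r"[a-z]+", text)         # keep alphabetic tokens only


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    tf  = term count / number of tokens in the document
    idf = log(number of documents / number of documents containing the term)
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty documents
        weights.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weights


# Toy inputs made up for illustration; they are not taken from the dataset.
docs = [
    "Garlic cures COVID-19! http://example.com/post",
    "Vaccines reduce COVID-19 risk",
    None,
]
vectors = tf_idf(docs)
```

In this toy run, a term that appears in only one document (such as "garlic") receives a higher weight than a term shared by several documents (such as "covid"), which is the property that makes the weighting useful as classifier input.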

Highlights

  • NLP is a specialized area of research that deals with how computers can understand and manipulate human language to perform useful operations

  • The dataset used in this research work is titled “COVID Fake News Dataset”, developed by Sumit Banik (2020) and published on Coronavirus Disease Research Community-Covid-19

  • Comparing the different measures, we find that BiLSTM and Convolutional Neural Network (CNN) perform better than the other learning methods: although the machine learning algorithms give good accuracy, CNN and BiLSTM are the most efficient because they gave very high accuracy
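The performance measures used for this comparison (confusion matrix, classification rate, true positive rate) can be sketched as below for the binary case. This is an illustrative sketch only: the helper names and the toy label lists are hypothetical and do not reproduce the paper's results.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def classification_rate(tp, fp, tn, fn):
    """Accuracy: share of all predictions that are correct."""
    total = tp + fp + tn + fn
    return (tp + tn) / total if total else 0.0


def true_positive_rate(tp, fn):
    """Recall: share of actual positives that were predicted positive."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Toy labels made up for illustration (1 = positive class, 0 = negative class).
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 1]
tp, fp, tn, fn = confusion_counts(y_true, y_pred)
accuracy = classification_rate(tp, fp, tn, fn)
recall = true_positive_rate(tp, fn)
```

Computing the same measures for every candidate model on a held-out set is what allows the models to be ranked against each other.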


Summary

Introduction

NLP is a specialized area of research that deals with how computers can understand and manipulate human language (text and speech) to perform useful operations. After analyzing the data, a proposed model can extract relevant or useful data using context, and the input can be represented in a different way [1]. AI technology is very popular for smart homes, smart industries, smart transportation, smart healthcare, smart cities, and satellite. It comprises many IoT devices (Things) that are equipped with different sensors, actuators, and storage, computational, and communication capabilities to collect and exchange data over the traditional internet. Natural language processing allows the system to perform operations on natural human language and translates it into a machine-understandable format [3]. [4] states in their

