Abstract

Sentiment analysis has become increasingly important with the massive growth of online content. It is concerned with analyzing textual data generated on social media, which can be easily accessed, collected, and analyzed. Since the emergence of COVID-19, most published studies on COVID-19 conspiracy theories have been surveys of people's sentiments and opinions that examined the impact of the pandemic on their lives. Only a few studies applied machine-learning-based sentiment analysis to social media. These studies focused mainly on English-language tweets and paid little attention to other languages such as Arabic. This study proposes a machine learning model to analyze Arabic tweets from Twitter. In this model, we apply Word2Vec word embeddings, which form the main source of features. Two pretrained continuous bag-of-words (CBOW) models are investigated, and Naïve Bayes is used as a baseline classifier. Several single and ensemble machine learning classifiers are evaluated with and without SMOTE (synthetic minority oversampling technique). The experimental results show that combining word embeddings with an ensemble classifier and SMOTE improves the average F1 score compared to the baseline classifier and to the other classifiers (single and ensemble) without SMOTE.
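
The two ingredients named in the abstract, embedding-based features and SMOTE, can be illustrated with a minimal pure-Python sketch. It rests on assumptions not stated in the source: tweet-level features are formed here by averaging word vectors (the abstract does not say how word embeddings were aggregated), and the toy vectors, the function names `average_embedding` and `smote`, and all parameter values are hypothetical. In practice a library implementation such as `SMOTE` from imbalanced-learn would typically be used instead.

```python
import math
import random


def average_embedding(tokens, vectors, dim):
    """Average the vectors of known tokens (assumed aggregation scheme).

    Returns a zero vector when no token is in the vocabulary.
    """
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return [0.0] * dim
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def smote(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic minority samples.

    Each synthetic point is a random interpolation between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical 2-dimensional "embeddings" for illustration only.
toy_vectors = {"w1": [1.0, 0.0], "w2": [0.0, 1.0]}
features = average_embedding(["w1", "w2", "unknown"], toy_vectors, dim=2)

# Hypothetical minority-class feature vectors.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
new_points = smote(minority, n_new=4, k=2)
```

Because every synthetic point lies on a line segment between two existing minority samples, oversampling adds no information from outside the minority class. It is normally applied to the training split only, so that evaluation scores such as F1 are computed on unmodified data.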

Highlights

  • With the emergence of the coronavirus disease (COVID-19) in December 2019 in the Chinese city of Wuhan and its rapid spread to most countries around the world, much discussion began on social media about the causes of the virus's emergence, its symptoms, ways to prevent it, and the efforts made by developed countries and global research centers to find a drug to treat it

  • Since there is no scientific evidence from governmental sources or the World Health Organization showing the true reasons for the emergence and outbreak of COVID-19, people turn to other sources such as social media to obtain information, exchange ideas and experiences, and discuss various issues related to the disease

  • Most studies use the Twitter and Facebook platforms to analyze posts written in the English language

Summary

Introduction

With the emergence of the coronavirus disease (COVID-19) in December 2019 in the Chinese city of Wuhan and its rapid spread to most countries around the world, much discussion began on social media about the causes of the virus's emergence, its symptoms, ways to prevent it, and the efforts made by developed countries and global research centers to find a drug to treat it. With the emergence of COVID-19, analysts began analyzing textual data on social media platforms such as Twitter and Facebook to study public trends and opinions on various topics, such as people's opinions about the conspiracy theories surrounding the emergence and spread of the disease. The primary objective of this study is to analyze Arab people's attitudes towards COVID-19 conspiracy theories through sentiment analysis of Twitter data. This study was carried out on tweets using sentiment classification with different machine learning algorithms to analyze and visualize opinions regarding COVID-19-related conspiracy theories, such as the claims that COVID-19 was made in a Chinese lab, that COVID-19 is related to 5G networks, and that Bill Gates supported the development of a global surveillance system by spreading vaccination widely using the COVID-19 pandemic.

Related Work
Experiments and Results
[Table: columns Model, Without SMOTENC, With SMOTENC]
Conclusion