Comparison of Machine Learning Approaches for Detecting COVID-19-Lockdown-Related Discussions During Recovery and Lockdown Periods

A.H Alamoodi, Ibrahim Alshakhatreh, A.S Albahri, O.S Albahri, Mohammed Rashad Baker, Moceheb Lazam Shuwandy, Victorian Institute Of Technology, Australia, Amneh Alamleh, Salem Garfan

doi:10.31181/jopi1120233

Open Access

Journal: Journal of Operations Intelligence
Publication Date: Oct 25, 2023
Citations: 2
License type: CC BY

Abstract

Since COVID-19 was declared a pandemic, governments around the world have implemented multiple phases of lockdown measures to curb the spread of the virus. These measures have been accompanied by widespread fear and panic, driven by social media discussions. Because individuals hold diverse opinions about these measures both during and after lockdown, positive and negative lockdown-related discussions should be differentiated to better understand the major related issues and to inform appropriate messaging and policy choices in the future. We conduct a sentiment analysis (SA) of COVID-19-lockdown-related tweets using different machine learning (ML) classifiers and evaluate their performance before and after applying the synthetic minority oversampling technique (SMOTE). The research is performed in five phases: data collection, pre-processing of the dataset, annotation of the dataset, application of SMOTE, and classification with ML classifiers. After SMOTE was applied, we observe an improvement in accuracy (Acc), as confirmed by the Matthews correlation coefficient (MCC), across most classifiers. The exception is the k-nearest neighbour (KNN) classifier, whose Acc decreased from 0.82 to 0.59 and whose MCC decreased from 0.544 to 0.279. Despite the potential of SMOTE with some classifiers, the technique cannot be considered a universal solution, especially with other classifiers and datasets. The study highlights the need to evaluate and benchmark the integration of data balancing approaches with ML classifiers and to consider additional metrics, such as MCC, for binary classification problems, especially in SA.
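To make the two central ingredients of the abstract concrete, the following is a minimal, illustrative Python sketch and not the authors' implementation. It assumes the textbook definitions of Acc and MCC computed from confusion-matrix counts, and the standard SMOTE idea of interpolating between a minority sample and one of its nearest minority neighbours. The function names (`accuracy`, `mcc`, `smote`), the parameters `k` and `seed`, and all numbers in the usage lines are hypothetical and are not taken from the study.

```python
import math
import random


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is taken as 0 when any marginal total is empty.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority points (simplified SMOTE).

    Each synthetic point lies on the line segment between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest neighbours of x among the other minority samples.
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: math.dist(x, p),
        )[:k]
        nb = rng.choice(neighbours)
        u = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(a + u * (b - a) for a, b in zip(x, nb)))
    return synthetic


# Hypothetical minority-class feature vectors (assumed, for illustration).
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote(minority, n_new=6)
print(len(new_points))  # 6

# Hypothetical imbalanced test set: 90 negatives, 10 positives, and a
# classifier that predicts "negative" for every sample.
print(accuracy(tp=0, tn=90, fp=0, fn=10))  # 0.9
print(mcc(tp=0, tn=90, fp=0, fn=10))       # 0.0
```

In the hypothetical usage above, a classifier that ignores the minority class entirely still reaches an accuracy of 0.9 while its MCC is 0, which illustrates why a metric such as MCC is a useful complement to Acc on imbalanced binary problems.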

Full Text