Abstract

In December 2019, a highly contagious disease, Coronavirus disease 2019 (COVID-19), was first detected in Wuhan, China. The disease has since spread to 212 countries and territories worldwide. As the epidemic continued to infect millions of people, several nations resorted to complete lockdowns. During these lockdowns, people turned to social networks to share their opinions and feelings and to find a way to calm down. This study proposes a sentiment analysis of US-based tweets that combines machine learning with a lexicon-based approach. The dataset, collected with RStudio from 30 January 2020 to 10 May 2020, contains 11,858 US-based tweets. We use TextBlob to label each tweet as positive, negative, or neutral, and we pre-process the tweets to clean the data. Next, feature extraction techniques such as bag-of-words (BoW) and term frequency-inverse document frequency (TF-IDF) are used to preserve expressive information. Finally, random forest, gradient boosting machine, extra trees classifier, logistic regression, and support vector machine models are used to classify each tweet's sentiment as positive, negative, or neutral. The output of the proposed pipeline is assessed using accuracy, precision, recall, and F1 score. The study shows that TF-IDF features can improve the performance of the supervised machine learning models: the gradient boosting machine outperforms the other models and achieves an accuracy of 96% when paired with TF-IDF features. The analysis was carried out to understand how citizens of the United States are handling the situation. The experimental results validate the effectiveness of the approach.
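
To make the difference between the two feature representations concrete, the following is a minimal sketch of BoW and TF-IDF in plain Python. It is not the study's implementation: the tokenizer, the function names (`tokenize`, `bag_of_words`, `tf_idf`), and the smoothed IDF formula are all assumptions chosen for illustration, and the abstract does not specify which variant the authors used.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumed stand-in for the pre-processing step: lowercase the text
    # and keep runs of ASCII letters only.
    return re.findall(r"[a-z]+", text.lower())


def bag_of_words(docs):
    # Build a sorted vocabulary, then one row of raw term counts per document.
    vocab = sorted({tok for doc in docs for tok in tokenize(doc)})
    rows = []
    for doc in docs:
        counts = Counter(tokenize(doc))
        rows.append([counts[term] for term in vocab])
    return vocab, rows


def tf_idf(docs):
    # Weight each relative term frequency by a smoothed inverse document
    # frequency: idf = ln((1 + N) / (1 + df)) + 1. This is one common
    # variant, assumed here for illustration.
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    doc_freq = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log((1 + n_docs) / (1 + df)) + 1 for df in doc_freq]
    rows = []
    for row in counts:
        total = sum(row) or 1  # avoid division by zero for empty documents
        rows.append([(count / total) * weight for count, weight in zip(row, idf)])
    return vocab, rows
```

Calling `bag_of_words` on a handful of short texts returns raw counts, so a frequent word dominates its row. `tf_idf` scales those counts down for terms that occur in many documents, which gives rarer and typically more discriminative terms relatively more weight before the vectors are passed to a classifier.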
