Abstract

Sentiment analysis on Twitter has recently received much attention because of its extensive use in both public and commercial contexts, particularly for tweets about COVID-19. Information about COVID-19 has spread widely over social media, producing a range of views, opinions, and emotions about the pandemic, which has significantly affected people's health. Manually identifying rumors among these posts on public platforms is exceedingly difficult for the authorities. This paper proposes a framework for text classification using the RNN model and its variants: LSTM, BiLSTM, and GRU. The study aims to determine which recurrent network model performs best for classifying Twitter data. We used Twitter data related to COVID-19 and the lockdown, labeled with four classes (sad, joy, fear, and anger). In addition, the study examines whether GloVe pre-trained word embeddings increase the accuracy of model predictions. The dataset was split into 80% for training and 20% for testing. Early stopping was used with a patience of 15 epochs and a minimum delta of 0.01, meaning that training stops if accuracy does not improve by at least this margin within 15 consecutive epochs. We used the average F1-score to evaluate the classification results. The test results show that the BiLSTM model with GloVe word embeddings yields the best F1-score among the models compared. Moreover, across all models tested, the 'fear' class attains the highest F1-score of the four classes.
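The early stopping rule described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `stopping_epoch` and the accuracy history are hypothetical, and the sketch assumes that the monitored quantity is an accuracy value on a 0 to 1 scale, so a minimum delta of 0.01 corresponds to one percentage point.

```python
def stopping_epoch(accuracy_history, patience=15, min_delta=0.01):
    """Return the 1-based epoch at which training would stop early,
    or None if the stopping condition is never met.

    An epoch counts as an improvement only if it beats the best
    accuracy seen so far by more than `min_delta`.
    """
    best = float("-inf")
    epochs_without_improvement = 0
    for epoch, acc in enumerate(accuracy_history, start=1):
        if acc - best > min_delta:
            # Meaningful improvement: record it and reset the counter.
            best = acc
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# Hypothetical history: three improving epochs, then a plateau whose
# gain (0.005) is smaller than min_delta, so it does not count.
history = [0.50, 0.60, 0.70] + [0.705] * 15
print(stopping_epoch(history))  # prints 18
```

In this example the last counted improvement occurs at epoch 3, and the 15 plateau epochs that follow exhaust the patience, so training stops at epoch 18. Deep learning libraries commonly provide a callback with similar patience and minimum-delta parameters; the exact comparison used may differ from this sketch.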
