Abstract
Web users have become increasingly connected during the Covid-19 pandemic, and the social web has grown rapidly as a result of the large volume of collective information they produce. Twitter, for example, has grown very quickly into one of the most popular social networking websites. The platform makes online mental health surveillance possible through text classification methods. Recent text classification research has shown that tweets can be classified accurately by combining word embedding with the K-means algorithm. Word embedding represents words as numeric vectors so that the representation can be fed into a clustering algorithm. However, several word embedding models are available (Word2Vec, ELMo, and BERT), which raises the question of which one performs best for text classification tasks. Many kinds of thoughts are shared on Twitter, particularly thoughts related to anxiety during the pandemic. This study aims to determine the most accurate word embedding method for grouping tweets related to Covid pandemic anxiety into more specific clusters. Each cluster is then evaluated to determine whether it relates to feelings of loneliness. To analyze classification performance, each model is judged on the quality of the clusters its representation produces. Lastly, the three word embedding methods are compared using metrics derived from the confusion matrix (precision, recall, F1, and accuracy).
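As a rough illustration of the final comparison step, the sketch below shows one way cluster assignments could be scored with confusion-matrix metrics. The abstract does not state how clusters are mapped to classes, so the majority-vote mapping here is an assumption, and the function names, the "lonely"/"other" labels, and the toy data are hypothetical.

```python
from collections import Counter


def map_clusters_to_labels(cluster_ids, true_labels):
    """Assign each cluster the most frequent true label among its members.

    Assumption: majority voting; the study may map clusters differently.
    """
    members = {}
    for cluster, label in zip(cluster_ids, true_labels):
        members.setdefault(cluster, []).append(label)
    return {c: Counter(ls).most_common(1)[0][0] for c, ls in members.items()}


def binary_metrics(true_labels, predicted_labels, positive):
    """Compute precision, recall, F1, and accuracy for one positive class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / total if total else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


# Hypothetical toy data: cluster ids as produced by some clustering step,
# plus hand-made reference labels for the same eight tweets.
cluster_ids = [0, 0, 0, 1, 1, 1, 1, 0]
true_labels = ["lonely", "lonely", "other", "other",
               "other", "lonely", "other", "lonely"]

mapping = map_clusters_to_labels(cluster_ids, true_labels)
predicted = [mapping[c] for c in cluster_ids]
metrics = binary_metrics(true_labels, predicted, positive="lonely")
print(mapping)
print(metrics)
```

Running the same scoring once per embedding model, with the cluster ids each model yields, would give directly comparable precision, recall, F1, and accuracy values.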
More From: Artificial Intelligence, Machine Learning, and Mental Health in Pandemics