Classification Performance Comparison of BERT and IndoBERT on SelfReport of COVID-19 Status on Social Media

Irwan Budiman, Muhammad Itqan Mazdadi, Friska Abadi, Andi Farmadi, Triando Hamonangan Saragih, Astina Faridhah, Mohammad Reza Faisal

doi:10.35784/jcsi.5564

Abstract

Messages shared on social media platforms such as X can be automatically categorized into two groups: those that self-report a COVID-19 status and those that do not. However, it is essential to note that these messages cannot serve as a reliable monitoring tool for tracking the spread of the COVID-19 pandemic. Such messages can be categorized by applying classification algorithms. Many deep learning-based algorithms, such as Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM), have been used for text classification. However, CNN is limited in its understanding of global context, while LSTM focuses on word-by-word sequences, and both require a large amount of training data. A more recent algorithm for text classification, Bidirectional Encoder Representations from Transformers (BERT), can cover the shortcomings of these earlier algorithms, and many BERT variants have since been developed. The primary objective of this study was to compare the effectiveness of two classification models, BERT and IndoBERT, in identifying messages that self-report a COVID-19 status. Both models were evaluated on raw and preprocessed text data from X. The findings show that the IndoBERT model performed better, achieving an accuracy of 94%, whereas the BERT model reached 82%.
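
The abstract compares the two models on raw and preprocessed text and reports accuracy, but it does not list the preprocessing steps or the evaluation code. The following is a minimal sketch of what such a setup might look like, not the authors' method. The cleaning steps (lowercasing, removing URLs, mentions, and punctuation), the function names `preprocess` and `accuracy`, and the label convention (1 = self-report, 0 = not a self-report) are all assumptions made for illustration.

```python
import re


def preprocess(text: str) -> str:
    """Clean one message. These steps are assumed, not taken from the paper."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs
    text = re.sub(r"@\w+", " ", text)          # drop user mentions
    text = text.replace("#", " ")              # keep the hashtag word, drop the symbol
    text = re.sub(r"[^\w\s]", " ", text)       # replace punctuation with spaces
    return re.sub(r"\s+", " ", text).strip()   # collapse repeated whitespace


def accuracy(y_true: list[int], y_pred: list[int]) -> float:
    """Fraction of predictions that match the gold labels."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical example message and labels (1 = self-report, 0 = not).
raw = "Saya positif COVID-19 :( https://t.co/abc @dokter #isoman"
cleaned = preprocess(raw)                          # "saya positif covid 19 isoman"
score = accuracy([1, 0, 1, 1], [1, 0, 0, 1])       # 0.75
```

In a comparison like the one described, each model's predictions on the raw messages and on the cleaned messages would be scored with the same `accuracy` function, so that the four resulting numbers are directly comparable.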

Journal: Journal of Computer Sciences Institute
Publication Date: Mar 20, 2024
License type: CC BY-SA 4.0
