Indonesian multilabel classification using IndoBERT embedding and MBERT classification

Ghinaa Zain Nabiilah, Muhammad Fadlan Hidayat, Eko Setyo Purwanto, Islam Nur Alam

doi:10.11591/ijece.v14i1.pp1071-1078

Open Access

https://doi.org/10.11591/ijece.v14i1.pp1071-1078

Abstract

The rapid increase in social media activity has created many spaces for discussion and information exchange. Users can easily tell stories or comment on almost anything without limits. However, this often leads to open debates that escalate into fights, because many users post toxic comments containing elements of racism, radicalism, pornography, or slander to argue with and corner individuals or groups. Such comments spread easily, and the unhealthy and unfair debates they fuel can harm users who are vulnerable to mental disorders. A model is therefore needed to classify comments, especially toxic ones, in Indonesian. Transformer-based models and natural language processing approaches can be applied to build such classifiers. Previous research has addressed the classification of toxic comments, but the reported results still leave room for improvement. This research proposes a model that uses different pre-trained models at the embedding and classification stages: Indonesian bidirectional encoder representations from transformers (IndoBERT) for embedding, and multilingual bidirectional encoder representations from transformers (MBERT) for classification. The proposed model achieves an F1 value of 0.9032.
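To make the reported metric concrete, the following is a minimal sketch of how an F1 value can be computed for multilabel predictions. It is not the authors' evaluation code: the label set is assumed from the categories named in the abstract, the 0.5 threshold is an assumption, and micro-averaging is an assumption because the abstract does not state which F1 variant was used. The probabilities are made-up illustration values, not model output.

```python
from typing import List, Sequence

# Hypothetical label set, assumed from the categories named in the abstract.
LABELS = ["racism", "radicalism", "pornography", "slander"]


def binarize(probs: Sequence[Sequence[float]], threshold: float = 0.5) -> List[List[int]]:
    """Turn per-label probabilities into 0/1 decisions (threshold is an assumption)."""
    return [[1 if p >= threshold else 0 for p in row] for row in probs]


def micro_f1(y_true: Sequence[Sequence[int]], y_pred: Sequence[Sequence[int]]) -> float:
    """Micro-averaged F1: pool true/false positives and false negatives over all labels."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Illustrative data only: three comments, each with one probability per label.
y_true = [[1, 0, 0, 1], [0, 0, 0, 0], [0, 1, 1, 0]]
probs = [[0.9, 0.2, 0.1, 0.4], [0.1, 0.6, 0.2, 0.3], [0.2, 0.8, 0.7, 0.1]]
y_pred = binarize(probs)
score = micro_f1(y_true, y_pred)
print(f"micro-F1 = {score:.4f}")
```

Because each comment can carry several labels at once, every label is thresholded independently rather than picking a single highest-scoring class.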

Journal: International Journal of Electrical and Computer Engineering (IJECE)
Publication Date: Feb 1, 2024
Citations: 1
License type: CC BY-SA 4.0

Similar Papers

Recommendations for Social Media Use in Hospitals and Health Care Facilities
Ivy D Patdu
Philippine Journal of Otolaryngology-Head and Neck Surgery | VOL. 31
24 Jun 2016

Navigating Social Media in #Ophthalmology
Edmund Tsui ... Rajesh C Rao
Ophthalmology | VOL. 126
20 May 2019

BERT base model for toxic comment analysis on Indonesian social media
Ghinaa Zain Nabiilah ... Abba Suganda Girsang
Procedia Computer Science | VOL. 216
01 Jan 2023

Use of Social Media as a Learning Media in 21st Century Learning
Ahmad Hidir ... Yaredi Waruwu
Al-Hijr: Journal of Adulearn World | VOL. 2
15 Nov 2023
