Abstract

Named Entity Recognition (NER), which extracts entities such as people, organizations, laws, religions, and locations from text, is an important application of machine learning. NER for the Indonesian language still faces significant challenges due to the lack of high-quality labelled datasets, which limits the development of more advanced models. To address this issue, we utilized several pre-trained BERT models (bert-base-uncased, indobenchmark/indobert-base-p1, and indolem/indobert-base-uncased) and several datasets (NERGRIT-IndoNLU, NERGRIT-Corpus, NERUGM, and NERUI). This study proposes a novel fusion approach that integrates deep learning architectures such as CNN, Bi-LSTM, Bi-GRU, and CRF to detect 19 entity types. The convolutional and recurrent layers enhance BERT's sequence modelling and feature extraction capabilities, while the CRF improves entity prediction by enforcing global constraints on the tag sequence. Experimental results demonstrate that the fusion approach outperforms previous methods: accuracy reached 94.75% with bert-base-uncased, 95.75% with indobenchmark/indobert-base-p1, and 95.85% with indolem/indobert-base-uncased. This study highlights the effectiveness of combining deep learning architectures with pre-trained transformers to improve NER performance in Indonesian, and the proposed methodology offers significant advances in entity extraction for languages with limited datasets.
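The role of the CRF layer described above can be illustrated with a minimal Viterbi decoder. This is a hedged sketch, not the paper's implementation: the tag set, emission scores, and transition scores below are illustrative stand-ins for what the trained BERT/CNN/Bi-LSTM/Bi-GRU stack and learned CRF parameters would produce. It shows how sequence-level constraints (here, forbidding an `I-PER` tag immediately after `O`) can override a locally plausible but invalid per-token prediction.

```python
import math

def viterbi_decode(emissions, transitions, tags):
    """Find the highest-scoring tag sequence under a linear-chain CRF.

    emissions:   one dict per token mapping tag -> emission score.
    transitions: dict mapping (prev_tag, tag) -> transition score;
                 missing pairs are treated as forbidden (-inf).
    tags:        list of all tag names.
    """
    # Initialize with the first token's emission scores.
    scores = {t: emissions[0][t] for t in tags}
    backpointers = []
    for emit in emissions[1:]:
        new_scores, bp = {}, {}
        for t in tags:
            # Best previous tag under the transition constraints.
            prev = max(tags, key=lambda p: scores[p] + transitions.get((p, t), -math.inf))
            new_scores[t] = scores[prev] + transitions.get((prev, t), -math.inf) + emit[t]
            bp[t] = prev
        scores = new_scores
        backpointers.append(bp)
    # Trace the best path back from the highest-scoring final tag.
    best = max(tags, key=lambda t: scores[t])
    path = [best]
    for bp in reversed(backpointers):
        path.append(bp[path[-1]])
    return list(reversed(path))

# Illustrative example (hypothetical scores): a three-token sentence
# whose middle token looks like I-PER in isolation.
tags = ["B-PER", "I-PER", "O"]
transitions = {  # ("O", "I-PER") is absent, so that transition is forbidden
    ("O", "O"): 0.0, ("O", "B-PER"): 0.0,
    ("B-PER", "B-PER"): 0.0, ("B-PER", "I-PER"): 0.0, ("B-PER", "O"): 0.0,
    ("I-PER", "B-PER"): 0.0, ("I-PER", "I-PER"): 0.0, ("I-PER", "O"): 0.0,
}
emissions = [
    {"B-PER": 0.4, "I-PER": 0.0, "O": 0.5},  # greedy pick: O
    {"B-PER": 0.1, "I-PER": 1.5, "O": 0.2},  # greedy pick: I-PER (invalid after O)
    {"B-PER": 0.0, "I-PER": 0.1, "O": 1.0},  # greedy pick: O
]
path = viterbi_decode(emissions, transitions, tags)
print(path)  # → ['B-PER', 'I-PER', 'O']
```

Greedy per-token decoding would yield `O, I-PER, O`, an ill-formed sequence; the Viterbi search instead selects `B-PER, I-PER, O`, which is how a CRF layer enforces globally consistent entity spans on top of the network's token-level scores.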
