ABioNER: A BERT‐Based Model for Arabic Biomedical Named‐Entity Recognition

Nada Boudjellal, Asif Khan, Huaping Zhang, Arshad Ahmad, Rashid Naseem, Lin Dai, Jianyun Shang

doi:10.1155/2021/6633213

Open Access

Abstract

The web is loaded daily with a huge volume of data, mainly unstructured text, which significantly increases the need for information extraction and NLP systems. Named‐entity recognition is a key step towards efficiently understanding text data and saving time and effort. Because English is so widely used globally, it dominates the research conducted in this field, especially in the biomedical domain. Arabic, by contrast, suffers from a lack of resources. This work presents a BERT‐based model that identifies biomedical named entities in Arabic text (specifically disease and treatment named entities) and investigates whether pretraining a monolingual BERT model on a small‐scale biomedical dataset improves the model's understanding of Arabic biomedical text. The model was compared with two state‐of‐the‐art models (namely, AraBERT and multilingual BERT cased), and it outperformed both with an F1‐score of 85%.

Highlights

In the era of digital information, where the web is loaded with a large volume of data daily, the need for information extraction and natural language processing (NLP) systems is increasing significantly.
Our contributions can be summarized as follows: (i) We show that pretraining a monolingual BERT model on a small-scale domain-specific dataset can still improve the model's performance on that domain. (ii) Our model achieved better performance on the bioNER task for the Arabic language, outperforming the original multilingual BERT and AraBERT models. (iii) To the best of our knowledge, this is the first work of this kind on Arabic biomedical named-entity recognition (NER).
To demonstrate the effectiveness of ABioNER, we compared it with the AraBERT and BERT multilingual cased models.
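
The model comparison is reported in terms of F1-score. As a rough illustration only, the Python sketch below computes an entity-level F1 from BIO-tagged sequences. The BIO scheme, the label names Disease and Treatment, and the helper names extract_spans and entity_f1 are assumptions made for this example; the text here does not specify the tagging scheme or the exact evaluation procedure the authors used.

```python
from typing import List, Set, Tuple

# (entity type, start index, end index exclusive); hypothetical representation
Span = Tuple[str, int, int]


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one BIO tag sequence.

    Stray I- tags that do not continue an open span of the same type
    are ignored (an assumption of this sketch).
    """
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        continues = tag.startswith("I-") and label == tag[2:]
        if label is not None and not continues:
            spans.add((label, start, i))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def entity_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Micro-averaged F1 over exactly matching (type, start, end) spans."""
    tp = n_gold = n_pred = 0
    for g, p in zip(gold, pred):
        gold_spans, pred_spans = extract_spans(g), extract_spans(p)
        tp += len(gold_spans & pred_spans)
        n_gold += len(gold_spans)
        n_pred += len(pred_spans)
    if tp == 0:
        return 0.0
    precision, recall = tp / n_pred, tp / n_gold
    return 2 * precision * recall / (precision + recall)


# Toy example: the prediction finds the disease span but misses the treatment.
gold = [["B-Disease", "I-Disease", "O", "B-Treatment"]]
pred = [["B-Disease", "I-Disease", "O", "O"]]
print(round(entity_f1(gold, pred), 3))  # 0.667
```

In the toy example, precision is 1.0 (one predicted span, one correct) and recall is 0.5 (one of two gold spans found), which gives an F1 of about 0.667. Exact span matching is the stricter of the common conventions; a token-level score would generally be higher.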

Summary

Introduction

In the era of digital information, where the web is loaded with a large volume of data daily (mainly unstructured text data), the need for information extraction and natural language processing (NLP) systems is increasing significantly. The biomedical domain has a special and complex structure for named entities compared to other open text domains. Despite these complexities, it is still witnessing considerable progress in information extraction applications. Arabic is highly agglutinative and lacks written vowels, which are represented by diacritics; when these diacritics are missing, ambiguity arises [2]. Another challenge is the spelling variation of transliterated words.

Journal: Complexity
Publication Date: Jan 1, 2021
Citations: 35
License type: CC BY 4.0