Abstract
Natural language processing (NLP) has three basic tasks divided into two levels: the lexical level, which includes the tokenization task, and the syntactic level, which includes the part-of-speech (POS) tagging and named entity recognition (NER) tasks. Recent research has demonstrated the effectiveness of deep learning in many NLP tasks, including NER, POS tagging, sentiment analysis, and language modeling. This study focused on utilizing Long Short-Term Memory (LSTM), Bidirectional Long Short-Term Memory (BLSTM), Bidirectional Long Short-Term Memory with Conditional Random Field (BLSTM-CRF), and Long Short-Term Memory with Conditional Random Field (LSTM-CRF) deep learning techniques for tasks at the syntactic level, and on comparing their performance on noisy data. The models were trained and tested using the KALIMAT corpus, with simulated noise applied to the test set, and the F1-score was used for evaluation. The results of our experiments showed that the BLSTM-CRF model surpassed the other models on the NER task at low noise levels, while the LSTM-CRF model obtained a higher F1-score at higher noise levels. On the POS task, the BLSTM-CRF model achieved the highest F1-score at all noise levels compared to the other competing models.
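To make the architecture family under comparison concrete, the following is a minimal sketch of a BiLSTM-CRF sequence tagger in PyTorch with the third-party pytorch-crf package. All names, dimensions, and hyperparameters here are illustrative assumptions, not the paper's actual implementation:

```python
import torch
import torch.nn as nn
from torchcrf import CRF  # third-party: pip install pytorch-crf

class BiLSTMCRFTagger(nn.Module):
    """Illustrative BiLSTM-CRF tagger; hyperparameters are placeholders."""

    def __init__(self, vocab_size, num_tags, embed_dim=100, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        # Bidirectional LSTM reads each sentence left-to-right and
        # right-to-left; the two directions are concatenated per token.
        self.lstm = nn.LSTM(embed_dim, hidden_dim // 2,
                            bidirectional=True, batch_first=True)
        self.emissions = nn.Linear(hidden_dim, num_tags)
        # The CRF layer scores whole tag sequences, capturing transition
        # constraints (e.g. I-PER cannot follow B-LOC in NER).
        self.crf = CRF(num_tags, batch_first=True)

    def _scores(self, tokens):
        return self.emissions(self.lstm(self.embed(tokens))[0])

    def loss(self, tokens, tags, mask):
        # Negative log-likelihood of the gold tag sequence under the CRF.
        return -self.crf(self._scores(tokens), tags, mask=mask,
                         reduction='mean')

    def predict(self, tokens, mask):
        # Viterbi decoding of the most likely tag sequence per sentence.
        return self.crf.decode(self._scores(tokens), mask=mask)
```

The plain LSTM and BLSTM baselines can be read as the same model with a per-token softmax in place of the CRF, and the LSTM-CRF variant as the unidirectional counterpart (`bidirectional=False` with `hidden_dim` units).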