Arabic Named Entity Recognition Research Articles

Modelling the distributional semantics of a language as morphologically rich as Arabic needs to take into account its introflexive, fusional, and inflectional nature, attributes that make up its combinatorial sequences and substitutional paradigms. To evaluate such word distributional models, the benchmarks that have been used thus far in Arabic have mimicked those in English. This paper reports on a benchmark that we designed to reflect linguistic patterns in both Contemporary Arabic and Classical Arabic, the first being a cover term for written and spoken Modern Standard Arabic, and the second referring to pre-modern Arabic. The analogy items included in this benchmark were chosen in a transparent manner so that they capture the major features of nouns and verbs; derivational and inflectional morphology; high-, middle-, and low-frequency patterns and lexical items; and morphosemantic, morphosyntactic, and semantic dimensions of the language. All categories included in this benchmark were carefully selected to ensure proper representation of the language. The benchmark consists of 45 roots of the trilateral, all-consonantal, and semivowel-inclusive types; six morphosemantic patterns (’af‘ala, ifta‘ala, infa‘ala, istaf‘ala, tafa‘‘ala, and tafā‘ala); five derivations (the verbal noun, the active participle, and the masculine-feminine, feminine singular-plural, and masculine singular-plural contrasts); morphosyntactic transformations (perfect and imperfect verbs conjugated for all pronouns); and lexical semantics (synonyms, antonyms, and hyponyms of nouns, verbs, and adjectives), as well as capital cities and currencies. All categories include an equal proportion of high-, medium-, and low-frequency items. To validate the proposed benchmark, we developed a set of embedding models from different textual sources. We then tested them intrinsically using the proposed benchmark and extrinsically using two natural language processing tasks: Arabic Named Entity Recognition and Text Classification. The evaluation leads to the conclusion that the proposed benchmark is truly reflective of this morphologically rich language and discriminatory of word embeddings.
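The abstract above does not say how analogy items are scored, so the following is only a minimal sketch of one common approach, the vector-offset method (often called 3CosAdd): for an item a : b :: c : d, the predicted d is the vocabulary word closest in cosine similarity to b - a + c. The function names, the transliterated toy words, and the 2-D vectors are all hypothetical and are not taken from the benchmark.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def solve_analogy(emb, a, b, c):
    """Return the word closest to emb[b] - emb[a] + emb[c], excluding a, b, c."""
    target = [y - x + z for x, y, z in zip(emb[a], emb[b], emb[c])]
    candidates = (w for w in emb if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(emb[w], target))

def analogy_accuracy(emb, items):
    """Fraction of (a, b, c, d) items whose predicted word equals d."""
    hits = sum(solve_analogy(emb, a, b, c) == d for a, b, c, d in items)
    return hits / len(items)

# Hypothetical toy embeddings: verb -> active participle for two roots.
toy = {
    "kataba": [1.0, 0.0],     # "he wrote"
    "kaatib": [1.0, 1.0],     # "writer"
    "darasa": [2.0, 0.0],     # "he studied"
    "daaris": [2.0, 1.0],     # "one who studies"
    "madiina": [-1.0, -1.0],  # unrelated distractor
}
items = [("kataba", "kaatib", "darasa", "daaris")]
print(analogy_accuracy(toy, items))
```

With real embeddings, the same accuracy could be reported per category (for example, per morphosemantic pattern or frequency band) to see where a model is weak.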

Recurrent Neural Networks (RNNs) and transformers are deep learning models that have achieved remarkable success in several Natural Language Processing (NLP) tasks because they rely on neither handcrafted features nor enormous knowledge resources. Named Entity Recognition (NER) is an essential NLP task that is used in many applications such as information retrieval, question answering, and machine translation. NER aims to locate, extract, and classify named entities into predefined categories such as person, organization, and location. Arabic NER is considered a challenging task because of the complexity and the unique characteristics of Arabic. Most previous research on deep learning-based Arabic NER has focused on Modern Standard Arabic and Dialectal Arabic, which are distinct varieties from Classical Arabic. In this paper, we investigate deep learning-based Classical Arabic NER using different deep neural network architectures and a BERT-based contextual language model that is trained on general-domain Arabic text. We propose two RNN-based models by fine-tuning the pre-trained BERT language model to recognize and classify named entities in Classical Arabic. The pre-trained BERT contextual language model representations were used as input features to a BGRU/BLSTM model and were fine-tuned using a Classical Arabic NER dataset. In addition, we explore variant architectures of the proposed BERT-BGRU/BLSTM-CRF models. Experiments showed that the BERT-BGRU-CRF model outperformed the other models by achieving an F-measure of 94.76% on the CANERCorpus. To the best of our knowledge, this is the first work that aims to recognize named entities in Classical Arabic using deep learning.
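To make the reported F-measure concrete, here is a minimal sketch of how an entity-level F-measure is often computed for sequence taggers. It assumes a BIO tagging scheme and exact-match scoring of (start, end, type) spans; the abstract does not state which scheme or scorer was used, and the tag names PER and LOC and both function names are hypothetical.

```python
def extract_spans(tags):
    """Return the set of (start, end, type) entity spans in a BIO tag sequence.

    `end` is exclusive. An I- tag that does not continue a span of the same
    type is ignored (strict reading of BIO); this is a design assumption.
    """
    spans, start, kind = set(), None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes the last span
        inside = tag.startswith("I-") and kind == tag[2:]
        if not inside:
            if kind is not None:
                spans.add((start, i, kind))
            start, kind = (i, tag[2:]) if tag.startswith("B-") else (None, None)
    return spans

def f_measure(gold, pred):
    """Harmonic mean of span-level precision and recall (exact match)."""
    g, p = extract_spans(gold), extract_spans(pred)
    tp = len(g & p)
    precision = tp / len(p) if p else 0.0
    recall = tp / len(g) if g else 0.0
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0

# Hypothetical example: the tagger finds the person but misses the location.
gold = ["B-PER", "I-PER", "O", "B-LOC"]
pred = ["B-PER", "I-PER", "O", "O"]
print(f_measure(gold, pred))  # precision 1.0, recall 0.5
```

Scoring whole spans rather than individual tokens means a partially recognized multi-word name counts as an error, which is why span-level scores are usually lower than token-level accuracy.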

Articles published on Arabic Named Entity Recognition

Voting Strategies for Arabic Named Entity Recognition using Annotation Schemes

A Benchmark Evaluation of Multilingual Large Language Models for Arabic Cross-Lingual Named-Entity Recognition

Named Entity Recognition of Tunisian Arabic Using the Bi-LSTM-CRF Model

A Survey on Arabic Named Entity Recognition: Past, Recent Advances, and Future Trends

Advancements in Arabic Named Entity Recognition: A Comprehensive Review

Arabic NER Evaluation: Pre-Trained Models via Contrastive Learning vs. LLM Few-Shot Prompting

Improving Arabic Named Entity Recognition with a Modified Transformer Encoder

A benchmark for evaluating Arabic word embedding models

Boosting Arabic Named Entity Recognition with K-Fold Cross Validation on LSTM and Bi-LSTM Models

Towards an approach based on particle swarm optimization for Arabic named entity recognition on social media

RETRACTED: Arabic named entity recognition in social media based on BiLSTM-CRF using an attention mechanism

New approach for Arabic named entity recognition on social media based on feature selection using genetic algorithm

Classical Arabic Named Entity Recognition Using Variant Deep Neural Network Architectures and BERT

Data Augmentation Techniques on Arabic Data for Named Entity Recognition

Automatic Arabic Named Entity Extraction and Classification for Information Retrieval

Recognition System for Libyan Entity Names

Extracting Arabic Composite Names Using Genitive Principles of Arabic Grammar

A Recent Survey of Arabic Named Entity Recognition on Social Media

A Comparison between Conditional Random Field and Structured Support Vector Machine for Arabic Named Entity Recognition

Transfer Learning for Arabic Named Entity Recognition With Deep Neural Networks
