Abstract

Named entity recognition (NER) has, over time, become a helpful pre-processing step for several NLP tasks. It identifies entities in text and classifies them into predefined categories such as person, location, and organization. Most work on Arabic named entity recognition (ANER) has focused on Modern Standard Arabic (MSA). However, most text on the internet, especially on social media (currently a source for corpus development), is written in dialectal forms that do not follow standard writing rules. This paper investigates the feasibility of deep-learning-based named entity recognition for Algerian dialect text through a comparative study of five models: AraBERT, DziriBERT, MARBERT, ARBERT, and mBERT. We chose these five models for two main reasons: first, they are already pre-trained on MSA and Arabic dialect text; second, they have proved effective on tasks such as part-of-speech tagging, sentiment analysis, and ANER.

Keywords: NLP, Deep learning, Algerian dialect, Named Entity Recognition, Social media
