ANERsys: An Arabic Named Entity Recognition System Based on Maximum Entropy

Yassine Benajiba, Paolo Rosso, José Miguel Benedí Ruiz

doi:10.1007/978-3-540-70939-8_13

Abstract

Named Entity Recognition (NER) is the task of identifying proper names, as well as temporal and numeric expressions, in open-domain text. NER systems have proved very important for many Natural Language Processing (NLP) tasks, such as Information Retrieval and Question Answering. Unfortunately, the main efforts to build reliable NER systems for Arabic have been made in a commercial setting, and neither the approaches used nor the accuracy achieved are publicly known. In this paper, we present ANERsys, an NER system built exclusively for Arabic text and based on n-grams and maximum entropy. We also present the Arabic-specific heuristic and the gazetteers we used to boost the system. We developed our own training and test corpora (ANERcorp) and gazetteers (ANERgazet) to train, evaluate, and boost the implemented technique. A major effort was made to ensure that all experiments were carried out within the framework of the CoNLL 2002 conference. We carried out several experiments, and the preliminary results show that this approach successfully tackles the problem of NER for Arabic.
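To make the combination of n-gram context, gazetteer lookup, and maximum entropy concrete, the following is a minimal sketch and not the ANERsys implementation. The abstract does not give the feature set, tag scheme, or training procedure, so everything here is an assumption for illustration: the toy sentences (Latin-script placeholders rather than Arabic), the simplified `PER`/`LOC`/`O` tags, the `GAZETTEER` set, the feature templates, and the plain gradient-ascent trainer. The only part taken as given is the standard maximum entropy form p(c | x) = exp(Σ_i λ_i f_i(x, c)) / Z(x).

```python
import math
from collections import defaultdict

# Hypothetical toy data (not ANERcorp): (tokens, tags) pairs.
TRAIN = [
    (["Paolo", "visited", "Valencia", "yesterday"], ["PER", "O", "LOC", "O"]),
    (["Yassine", "lives", "in", "Valencia"], ["PER", "O", "O", "LOC"]),
    (["the", "meeting", "in", "Rabat", "ended"], ["O", "O", "O", "LOC", "O"]),
]
GAZETTEER = {"Valencia", "Rabat"}  # hypothetical stand-in for a location gazetteer


def features(tokens, i):
    """Binary features for token i: word, neighbours, word bigram, gazetteer hit."""
    prev_w = tokens[i - 1] if i > 0 else "<s>"
    next_w = tokens[i + 1] if i + 1 < len(tokens) else "</s>"
    feats = ["bias", "w=" + tokens[i], "prev=" + prev_w, "next=" + next_w,
             "bigram=" + prev_w + "_" + tokens[i]]
    if tokens[i] in GAZETTEER:
        feats.append("in_gazetteer")
    return feats


def probabilities(weights, feats, labels):
    """Maximum entropy distribution over labels for one token."""
    scores = {c: sum(weights[(f, c)] for f in feats) for c in labels}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {c: math.exp(s - top) for c, s in scores.items()}
    z = sum(exps.values())
    return {c: e / z for c, e in exps.items()}


def train(data, epochs=200, lr=0.1, l2=0.01):
    """Fit weights by stochastic gradient ascent on the L2-regularised log-likelihood."""
    labels = sorted({t for _, tags in data for t in tags})
    weights = defaultdict(float)
    examples = [(features(toks, i), tags[i])
                for toks, tags in data for i in range(len(toks))]
    for _ in range(epochs):
        for feats, gold in examples:
            probs = probabilities(weights, feats, labels)
            for c in labels:
                # Gradient: observed feature count minus expected feature count.
                grad = (1.0 if c == gold else 0.0) - probs[c]
                for f in feats:
                    weights[(f, c)] += lr * (grad - l2 * weights[(f, c)])
    return weights, labels


def tag(weights, labels, tokens):
    """Assign each token its most probable label, independently of the others."""
    result = []
    for i in range(len(tokens)):
        probs = probabilities(weights, features(tokens, i), labels)
        result.append(max(probs, key=probs.get))
    return result


if __name__ == "__main__":
    weights, labels = train(TRAIN)
    print(tag(weights, labels, ["Paolo", "lives", "in", "Rabat"]))
```

In this sketch the gazetteer contributes one extra binary feature rather than acting as a separate post-processing step; whether ANERsys applies its gazetteers this way is not stated in the abstract.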

Similar Papers

Towards a Novel Weakly Supervised Joint Approach of Named Entity Recognition and Normalization for Noisy Text
Assia Mezhar ... Mohammed Ramdani
SSRN Electronic Journal | VOL. -
01 Jan 2018

A comprehensive study of named entity recognition in Chinese clinical text
J Lei ... M Jiang
Journal of the American Medical Informatics Association | VOL. 21
17 Dec 2013

A Novel Hybrid Approach to Arabic Named Entity Recognition
Mohamed A Meselhi ... Khaled Shaalan
01 Jan 2014

Towards a Novel Weakly Supervised Joint Approach of Named Entity Recognition and Normalization for Noisy Text
Assia Mezhar ... Amal El Mzabi
SSRN Electronic Journal | VOL. -
09 May 2018
