Abstract

Anaphora resolution is a widely studied area of Natural Language Processing (NLP). It is crucial for many NLP applications, including information extraction, question answering, and text summarization. Most earlier work on anaphora resolution addresses English and other European languages. Arabic has not been sufficiently studied with respect to anaphora resolution and has rarely been the subject of machine learning experiments. In this paper, we present a machine learning approach to resolving pronominal anaphora in Arabic and determine the appropriate features for this task. A number of classifiers, namely naive Bayes, k-nearest neighbors, and linear logistic regression, are employed as base classifiers for each of the feature sets. We conduct an in-depth study of different feature sets to identify effective features and investigate their effect on anaphora resolution performance. Finally, we conduct a wide range of comparative experiments on Quranic datasets. The experimental results on the Arabic Quran training corpus demonstrate that the proposed method is feasible for pronominal anaphora resolution in Arabic.
