Abstract

The Quran and its translations are written in classical Arabic and in a classical register of English, which reduces the accuracy of ordinary natural language processing tools such as pronominal anaphora resolution systems. Pronominal anaphora resolution involves finding an antecedent for each anaphoric pronoun, that is, for each pronoun used as a referring expression in the discourse. The performance of a pronominal anaphora resolution system depends heavily on the pre-processing tools that analyze and prepare the input data for the resolution algorithm. This paper proposes a novel pre-processing approach for pronoun extraction and pronoun mapping in a pronominal anaphora resolution system for English translations of the Quran. The approach facilitates anaphora resolution, specifically for English pronouns without an explicit antecedent, which account for close to 50% of the anaphoric relations in the Quran. It uses morphological, statistical, and anaphoric knowledge extracted from the Arabic corpus of the Quran. To evaluate the approach, 1% of an English translation was annotated with labels for all anaphoric and non-anaphoric English pronouns. These pronouns were aligned to the equivalent Arabic pronouns and linked to the concepts in the Arabic text. The statistical results show that our rule-based pre-processing tools perform well. The precision, recall, and accuracy of the pronoun extraction stage are 96.38%, 100%, and 99.5%, respectively. The results of the mapping algorithm are promising, with 85.51% precision, 96.32% recall, and 82.81% accuracy.
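
For readers unfamiliar with the three reported measures, the following is a minimal sketch of how precision, recall, and accuracy are conventionally computed from confusion-matrix counts. The function name `evaluation_metrics` and the counts in the example are hypothetical and are not taken from the paper; the abstract does not state how its own figures were derived, so the standard definitions are assumed here.

```python
def evaluation_metrics(tp, fp, fn, tn):
    """Return (precision, recall, accuracy) from confusion-matrix counts.

    tp: items correctly identified (e.g. true pronouns that were extracted)
    fp: items wrongly identified
    fn: items that were missed
    tn: items correctly rejected
    Standard definitions are assumed; the paper may count differently.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    return precision, recall, accuracy


# Hypothetical counts for illustration only (not the paper's data).
precision, recall, accuracy = evaluation_metrics(tp=90, fp=10, fn=0, tn=100)
print(f"precision={precision:.2%} recall={recall:.2%} accuracy={accuracy:.2%}")
```

Under these definitions, a stage that misses nothing (no false negatives) reaches 100% recall even when some of its output is wrong, which is the pattern reported for the pronoun extraction stage.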
