Abstract

Relation extraction from unstructured Arabic text is especially challenging due to the Arabic language complex morphology and the variation in word semantics and lexical categories. The research documented in this paper presents a hybrid Semantic Knowledge base - Machine Learning (SKML) approach for extracting complex Arabic relations from unstructured Arabic documents; the proposed approach exploits the principles of Functional Discourse Grammar (FDG) to emphasise the semantic and pragmatic properties of the language and facilitate the identification of relation elements. At the initial phase, the novel FDG-SKML relation extraction approach deploys a lexical-based mechanism that utilises a purposely built domain-specific Semantic Knowledge to encode the semantic association between the identified relations’ elements. The evaluation of the initial stage evidenced improved accuracy for extracting most complex Arabic relations. The initial relation extraction mechanism was further extended by integrating its output into a Machine Learning classifier that facilitated extracting especially complex relations with significant disparity in the relation elements’ presence, order, and correlation. Using Economics as the problem domain, experimental evaluation evidenced the high accuracy of our FDG-SKML approach in complex Arabic relation extraction task and demonstrated its further improvement upon integration with machine learning classifiers.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call