Abstract

AbstractIn this paper, we address the problem of the large coverage dictionaries of Arabic language usable both for direct human reading and automatic Natural Language Processing. For these purposes, we propose a normalized and implemented modeling, based on Lexical Markup Framework (LMF-ISO 24613) and Data Registry Category (DCR-ISO 12620), which allows a stable and well-defined interoperability of lexical resources through a unification of the linguistic concepts. Starting from the features of the Arabic language, and due to the fact that a large range of details and refinements need to be described specifically for Arabic, we follow a finely structuring strategy. Besides its richness in morphology, syntax and semantics knowledge, our model includes all the Arabic morphological patterns to generate the inflected forms from a given lemma and highlights the syntactic–semantic relations. In addition, an appropriate codification has been designed for the management of all types of relationships among lexical entries and their related knowledge. According to this model, a dictionary named El Madar1has been built and is now publicly available on line. The data are managed by a user-friendly Web-based lexicographical workstation. This work has not been done in isolation, but is the result of a collaborative effort by an international team mainly within the ISO network during a period of eight years.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.