A Survey and Comparative Study of Arabic NLP Architectures

Younes Jaafar, Karim Bouzoubaa

doi:10.1007/978-3-319-67056-0_28

Abstract

Arabic Natural Language Processing (ANLP) has made significant progress in recent years. As a result, several ANLP tools and applications have been developed, such as tokenizers, part-of-speech taggers, morphological analyzers, and syntactic parsers. However, most of these tools are heterogeneous and can hardly be reused in other projects without modifying their source code. This problem is common to all languages, which is why advanced language-independent NLP architectures have emerged, such as GATE (Cunningham et al. ACL, 2002) [1] and UIMA (Apache UIMA Manuals and Guides, 2015) [2]. These architectures have significantly changed the way NLP applications are designed and developed. They provide homogeneous structures for applications, better reusability, and faster deployment. In this article, we present a comparative study of NLP architectures in order to determine which ones are well suited to the Arabic language and its specificities.

Full Text