Abstract
Named Entity Recognition (NER) is a key task for improving automatic text processing and is required in a wide variety of applications. NER techniques range from hand-crafted rules to machine learning approaches. In this paper, we describe the development and implementation of an Arabic Named Entity Recognition (ANER) system based on a machine learning approach. We use an SVM classifier with a set of language-dependent and language-independent features. We also investigate the use of patterns to improve the ANER task by implementing an automatic pattern extraction framework based on Part-of-Speech (POS) information and linguistic filters. Finally, we explore the impact of several feature combinations on the performance of the developed system. Our system achieves an overall average F-measure of 83.20%. To measure the effectiveness of the developed ANER system in a Topic Detection (TD) context, we conduct several experiments using named entities (NEs) as features. The results obtained are encouraging.