Abstract
This paper presents the development of an Arabic text categorization system. First, a stop-word list is generated using a statistical approach that captures the inflection of different Arabic words. Second, a feature representation model based on a Hidden Markov Model is developed to extract roots and morphological weights. Third, a semantic synonym-merging technique is presented for feature reduction. Finally, a Dewey-index-based back-propagation artificial neural network is developed for Arabic document categorization. The system was compared with other classifiers, and the results indicate that the architecture is promising.