Abstract

Arabic topic detection (ATD) has become an attractive research field. It is used in many applications, such as Arabic document classification, web search, social media, and security. ATD uses machine learning algorithms with the ultimate aim of classifying Arabic documents based on their text content. Arabic text classification requires a complicated process: Arabic words have unlimited variation in meaning, which adds complexity and ambiguity to the classification process. Several approaches to Arabic text classification have been proposed in recent years, but they leave room for improvement in accuracy and efficiency. This paper therefore proposes an effective approach for Arabic text classification and topic detection using the discriminative multinomial naive Bayes (DMNB) classifier and a frequency transform. The proposed approach includes three main steps: Arabic text preprocessing, Arabic text feature extraction and normalization, and Arabic text classification. A dataset of 1500 Arabic documents, collected from an Arabic articles corpus and covering 5 different topics, is used to evaluate the proposed approach. The experimental results of 10-fold cross-validation show that the proposed approach performs better than state-of-the-art approaches.
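
The three steps named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation, and several parts are assumptions: a plain multinomial naive Bayes classifier stands in for DMNB (the discriminative parameter update is omitted), the "frequency transform" is assumed to be log(1 + tf) followed by L2 length normalization, and the stop-word list, class names, and toy documents are hypothetical and exist only to make the example runnable.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical, deliberately tiny stop-word list, stored in normalized form
# (after the letter unification done in preprocess).
STOP_WORDS = {"في", "من", "علي", "الي", "عن"}

def preprocess(text):
    """Step 1: strip diacritics, unify letter variants, tokenize, drop stop words."""
    text = re.sub(r"[\u064B-\u0652\u0640]", "", text)  # tashkeel and tatweel
    text = re.sub(r"[أإآ]", "ا", text)                  # alef variants -> bare alef
    text = text.replace("ة", "ه").replace("ى", "ي")
    tokens = re.findall(r"[\u0621-\u064A]+", text)
    return [t for t in tokens if t not in STOP_WORDS]

def features(tokens):
    """Step 2 (assumed transform): log(1 + tf), then L2 length normalization."""
    weights = {w: math.log(1 + c) for w, c in Counter(tokens).items()}
    norm = math.sqrt(sum(v * v for v in weights.values())) or 1.0
    return {w: v / norm for w, v in weights.items()}

class MultinomialNB:
    """Step 3 stand-in: multinomial naive Bayes with Laplace smoothing (not DMNB)."""

    def fit(self, docs, labels):
        self.priors = Counter(labels)
        self.n_docs = len(labels)
        self.weights = defaultdict(lambda: defaultdict(float))
        self.totals = defaultdict(float)
        self.vocab = set()
        for doc, label in zip(docs, labels):
            for word, value in doc.items():
                self.weights[label][word] += value
                self.totals[label] += value
                self.vocab.add(word)
        return self

    def predict(self, doc):
        def score(label):
            s = math.log(self.priors[label] / self.n_docs)
            denom = self.totals[label] + len(self.vocab)
            for word, value in doc.items():
                if word in self.vocab:  # words unseen in training are ignored
                    seen = self.weights[label].get(word, 0.0)
                    s += value * math.log((seen + 1.0) / denom)
            return s
        return max(self.priors, key=score)

# Toy training set invented for illustration only.
train = [
    ("فاز الفريق في المباراة", "sports"),
    ("سجل اللاعب هدفا في المباراة", "sports"),
    ("ارتفع سعر النفط في السوق", "economy"),
    ("انخفضت أسعار الأسهم في السوق", "economy"),
]
model = MultinomialNB().fit(
    [features(preprocess(text)) for text, _ in train],
    [label for _, label in train],
)
prediction = model.predict(features(preprocess("اللاعب سجل هدفا")))
print(prediction)
```

Evaluating such a pipeline as the abstract describes would mean repeating the fit and predict steps over 10 cross-validation folds of the 1500-document corpus rather than over a handful of sentences.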
