Abstract

Arabic text classification is an application of Natural Language Processing (NLP) used to analyze and categorize Arabic text. Text analysis has become essential as the volume of text data keeps growing, which makes text classification a big data problem. Arabic text classification systems are important for maintaining vital information in domains such as education, health, and public services. In this work, Arabic text classification models are developed using several algorithms: Multinomial Naïve Bayes (MNB), Bernoulli Naïve Bayes (BNB), Stochastic Gradient Descent (SGD), Logistic Regression (LR), Support Vector Classifier (SVC), Linear SVC, and Convolutional Neural Networks (CNN). The algorithms are implemented on the Al-Khaleej dataset. Experiments with various text representation models show that CNN with a character-level representation outperforms the others. CNN exceeds the state-of-the-art machine learning methods, reaching an accuracy of 98%. The presented methods should be useful in different domains, particularly social media.
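
To make the idea of a character-level representation concrete, the following is a minimal sketch and not the authors' implementation. It pairs character trigrams with a small hand-written multinomial Naïve Bayes classifier rather than the CNN reported as the best model. The class name `CharNgramNB`, the helper `char_ngrams`, the trigram size, the smoothing value, and the four toy sentences are all assumptions made for illustration. None of them come from the Al-Khaleej dataset or from the paper.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=3):
    """Return the overlapping character n-grams of a string (hypothetical helper)."""
    padded = f" {text.strip()} "  # pad so word boundaries become features
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class CharNgramNB:
    """Toy multinomial Naive Bayes over character n-grams (illustrative only)."""

    def __init__(self, n=3, alpha=1.0):
        self.n = n          # n-gram length (assumed value, not from the paper)
        self.alpha = alpha  # Laplace smoothing constant (assumed value)

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.feature_counts[label].update(char_ngrams(text, self.n))
        self.vocab = {g for c in self.feature_counts.values() for g in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + self.alpha * vocab_size
            score = math.log(doc_count / total_docs)  # log prior
            for gram in char_ngrams(text, self.n):
                if gram in self.vocab:  # ignore n-grams never seen in training
                    score += math.log((counts[gram] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up toy corpus: two sports sentences and two economy sentences.
train_texts = [
    "فاز الفريق في مباراة كرة القدم",
    "سجل اللاعب هدفا في المباراة",
    "ارتفعت أسعار النفط في السوق",
    "انخفض سعر الدولار في البنك",
]
train_labels = ["sports", "sports", "economy", "economy"]

model = CharNgramNB(n=3).fit(train_texts, train_labels)
print(model.predict("مباراة كرة القدم اليوم"))
```

Working on characters instead of whole words avoids the need for Arabic tokenization or stemming in this sketch, which is one plausible reason a character-level model can perform well. The sketch itself makes no claim about matching the reported accuracy.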
