With the advent of AI text-based tools and applications, the need to develop and investigate text-processing tools has grown. NLP tools and techniques have developed rapidly for some languages, such as English. Other languages, such as Arabic, still need more methods and techniques and a clearer account of how they behave. In this study, we present a model to classify customer reviews written in Arabic. The HARD dataset is used as the benchmark dataset for this work. The study adopts four machine learning and deep learning classifiers: convolutional neural network (CNN), recurrent neural network (RNN), Naive Bayes (NB), and logistic regression (LR). The texts were cleaned using data cleaning techniques and then stemmed with three stemmers (Khoja Stemmer, Snowball Stemmer, Tashaphyne Stemmer). Two feature extraction methods were used (TF-IDF, N-gram). The results support several observations. The best performance came from the combination CNN + Snowball Stemmer + N-gram, with an accuracy of 93.5%. The results indicate that some classifiers are sensitive to the choice of tools, and that accuracy can also change with the feature extraction method used; both feature extraction methods affect accuracy. The results also show that colloquial Arabic imposes some limitations, because the same words can carry different meanings in the dialects of different regions or countries. The study opens the door to exploring other tools and methods that enrich Arabic natural language processing and contribute to the development of new applications that support Arabic content.
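The feature extraction step named above (TF-IDF over word N-grams) can be illustrated with a minimal sketch. This is not the study's implementation: the function names `word_ngrams` and `tfidf`, the whitespace tokenisation, the length-normalised term frequency, and the `log(N / df)` weighting are all assumptions chosen for illustration, and the three reviews are made-up examples rather than entries from HARD. Cleaning and stemming are omitted.

```python
import math
from collections import Counter


def word_ngrams(tokens, n_max=2):
    """Return all word n-grams of length 1..n_max, joined by spaces."""
    grams = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def tfidf(docs, n_max=2):
    """Compute one TF-IDF weight dictionary per document.

    Assumed variant: tf = count / number of n-grams in the document,
    idf = log(N / df). Other TF-IDF variants exist.
    """
    grams_per_doc = [word_ngrams(d.split(), n_max) for d in docs]

    # Document frequency: in how many documents each n-gram occurs.
    df = Counter()
    for grams in grams_per_doc:
        df.update(set(grams))

    n_docs = len(docs)
    vectors = []
    for grams in grams_per_doc:
        counts = Counter(grams)
        total = len(grams) or 1  # guard against empty documents
        vectors.append(
            {g: (c / total) * math.log(n_docs / df[g]) for g, c in counts.items()}
        )
    return vectors


# Hypothetical toy reviews (not taken from HARD).
reviews = ["الفندق جميل جدا", "الفندق سيء جدا", "الخدمة ممتازة"]
vectors = tfidf(reviews)
```

Each resulting dictionary maps an n-gram to its weight and could be turned into a fixed-length vector for a classifier such as NB or LR. N-grams shared by several reviews receive lower weights than n-grams unique to a single review.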