A study of the performance of embedding methods for Arabic short-text sentiment analysis using deep learning approaches

Ali Alwehaibi, Marwan Bikdash, Mohammad Albogmi, Kaushik Roy

doi:10.1016/j.jksuci.2021.07.011

Abstract

Sentiment analysis aims to classify a text according to the polarity of the opinions it expresses, such as positive, negative, or neutral. While most studies focus on eliciting features from English text, research on Arabic is limited due to the morphological and grammatical complexity of the Arabic language. In this paper, we propose an optimized sentiment classification approach for dialectal Arabic short text at the document level using deep learning (DL). The contributions of this paper are in three areas. First, we extracted semantic features for Arabic short text at the word level and character level. Second, we used three DL topologies for classification models: a long short-term memory recurrent neural network (LSTM); a convolutional neural network (CNN); and an ensemble model that combines the advantages of both to improve prediction performance. Third, we used a hyperparameter tuning estimation method to optimize the neural network performance. We trained and tested our proposed models on a dataset consisting of a Modern Standard Arabic and dialectal Arabic corpus collected from Twitter. The results showed significant improvement in Arabic text classification in terms of classification accuracy, which ranges between 88% and 69.7%. The ensemble model scored the highest accuracy of 96.7% on the test set.

Full Text