Sentiment Analysis on Moroccan Dialect based on ML and Social Media Content Detection

Mouaad Errami, Rabia Rachidi, Soufiane Hamida, Bouchaib Cherradi, Mohamed Amine Ouassil, Abdelhadi Raihani

doi:10.14569/ijacsa.2023.0140347

Abstract

As technology evolves, people adapt to it, and social media has become the de facto method of communication. As with verbal communication, people express their opinions in writing, and by analyzing their words one can infer what an individual thinks of a product, a topic, or an event. By examining the emotions expressed in such content, governments, businesses, and individuals can gain insights that help them improve their strategies. In this study, we therefore apply several algorithms to improve sentiment classification for Moroccan Dialectal Arabic. The first step is to collect and preprocess Twitter comments written in Moroccan Dialectal Arabic. Next, we construct features using various combinations of extraction schemes (n-grams), weighting schemes (BOW/TF-IDF), and word embeddings in order to obtain the best classification models. We classify the prepared data with Naive Bayes, Random Forests, Support Vector Machines, Logistic Regression, and LSTM. Our machine learning approach is designed to analyze the sentiment of Twitter comments written in Modern Standard Arabic or Moroccan Dialectal Arabic. As a final benchmark, our best model, based on the SVM algorithm, reached an accuracy just under 70%. Although this is not a game-changing result, it is enough to encourage us to continue developing our model.
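To make the feature construction step concrete, the following is a minimal sketch of word n-gram extraction with BOW and TF-IDF weighting in plain Python. It is not the authors' implementation: the function names, the whitespace tokenization, the unigram-plus-bigram range, and the smoothed IDF formula are all assumptions chosen for illustration, and the two transliterated sentences are made-up toy inputs rather than items from the collected dataset.

```python
import math
from collections import Counter


def word_ngrams(tokens, n_min=1, n_max=2):
    """Return all contiguous word n-grams with n_min <= n <= n_max."""
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def bow_vectors(docs, n_min=1, n_max=2):
    """Bag-of-words weighting: raw n-gram counts per document."""
    # Assumption: plain whitespace tokenization, no normalization.
    return [Counter(word_ngrams(doc.split(), n_min, n_max)) for doc in docs]


def tfidf_vectors(docs, n_min=1, n_max=2):
    """TF-IDF weighting: n-gram count scaled by a smoothed IDF."""
    counts = bow_vectors(docs, n_min, n_max)
    n_docs = len(counts)
    # Document frequency: number of documents containing each n-gram.
    df = Counter()
    for c in counts:
        df.update(c.keys())
    # Assumption: smoothed IDF, ln((1 + N) / (1 + df)) + 1.
    idf = {t: math.log((1 + n_docs) / (1 + d)) + 1.0 for t, d in df.items()}
    return [{t: tf * idf[t] for t, tf in c.items()} for c in counts]


# Made-up toy inputs (hypothetical, not from the paper's dataset).
toy_docs = ["film zwin bzaf", "film khayb bzaf"]
toy_tfidf = tfidf_vectors(toy_docs)
```

In this sketch, n-grams that occur in every document (such as "film") receive a lower weight than n-grams that distinguish one document from another (such as "zwin"), which is the property that makes TF-IDF features useful as input to a classifier such as an SVM.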

Full Text