Abstract

Sentiment analysis is a crucial Natural Language Processing (NLP) task that analyzes users' emotions and opinions towards events, entities, services, or products. Arabic NLP faces numerous challenges, including: (1) scarce resources for Modern Standard Arabic and for Arabic dialects, the Bahraini dialects in particular; (2) a lack of multilingual deep learning models; and (3) insufficient transfer learning studies on Arabic dialects in general and Bahraini dialects specifically. This research aims to create a balanced dataset of product reviews in Bahraini dialects by translating English Amazon product reviews into Modern Standard Arabic and then converting the translations into Bahraini dialects. A second aim is to provide a reusable multilingual deep learning model, based on Long Short-Term Memory (LSTM), that analyzes the parallel dataset of English, Modern Standard Arabic, and Bahraini dialects, which differ in their linguistic properties. Experiments were conducted using a train-validate-test split and k-fold cross-validation, and model performance was evaluated with accuracy, F1 score, and AUC. When an augmentation technique was used, the model's average accuracy across all datasets ranged from 96.72% to 97.04%, its F1 score from 97.91% to 97.93%, and its AUC from 98.46% to 98.7%. Moreover, a pre-trained LSTM model was created to transfer the knowledge gained from analyzing product reviews in Bahraini dialects to sentiment analysis on a small dataset of movie comments in the same dialects. The pre-trained model achieved 96.97% accuracy, a 96.65% F1 score, and 97.94% AUC.
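The evaluation protocol named in the abstract (k-fold cross-validation scored with accuracy, F1, and AUC) can be illustrated with a minimal sketch. This is not the authors' code: it assumes binary sentiment labels (1 = positive, 0 = negative), which the abstract does not state, and the function names `k_fold_indices`, `accuracy`, `f1_score`, and `roc_auc` are hypothetical helpers written in plain Python for illustration.

```python
from typing import List, Sequence, Tuple


def k_fold_indices(n: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split range(n) into k contiguous folds and return (train, test) index lists."""
    if not 2 <= k <= n:
        raise ValueError("k must satisfy 2 <= k <= n")
    base, extra = divmod(n, k)
    folds = []
    start = 0
    for i in range(k):
        # The first `extra` folds receive one additional sample.
        size = base + (1 if i < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        folds.append((train, test))
        start += size
    return folds


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for the positive class (label 1): 2TP / (2TP + FP + FN)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is O(P * N), fine for a sketch.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

In a full experiment, a model would be trained on each fold's training indices, scored on the held-out indices, and the three metrics averaged over the k folds. Shuffling or stratifying the data before splitting is a common refinement that this sketch omits.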
