Abstract

Sentiment Analysis, or Opinion Mining, has recently attracted the interest of researchers worldwide. As internet use grows, the web is filling with data that contains useful information applicable to many fields. Many studies have addressed Sentiment Analysis of online data in different languages, but research on the Arabic language is still limited. This paper presents an empirical study of Sentiment Analysis of online reviews written in Modern Standard Arabic. A new system called SSAAR (System for Sentiment Analysis of Arabic Reviews) is proposed, which automatically classifies reviews into three classes (positive, negative, neutral). The input data for this system is built with a proposed framework called SPPARF (Scraping and double Preprocessing Arabic Reviews Framework), which generates a structured and clean dataset. The system evaluates two improved approaches to sentiment classification based on supervised learning: a double preprocessing method and a feature selection method. Both approaches are trained with five algorithms (Naïve Bayes, Stochastic Gradient Descent (SGD) Classifier, Logistic Regression, K-Nearest Neighbors, and Random Forest) and then compared under the same conditions. The experimental results show that the feature selection method with the SGD Classifier achieves the best accuracy (77.1%). The SSAAR system therefore gives its best results with the feature selection method; the double preprocessing method also produced satisfactory results and is likewise considered suitable for the proposed system.
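The comparison described in the abstract (one feature selection step, five classifiers, identical conditions) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the feature representation, the selection criterion, or any hyperparameters, so the TF-IDF features, the chi-squared `SelectPercentile` step, the scikit-learn estimators and their settings, the helper names, and the twelve toy reviews below are all assumptions made for illustration.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectPercentile, chi2
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

# Toy hand-written reviews for illustration only (NOT the SPPARF dataset).
reviews = [
    "الفندق رائع والخدمة ممتازة",
    "منتج ممتاز وأنصح به",
    "تجربة رائعة وطعام لذيذ",
    "خدمة ممتازة وسعر مناسب",
    "الفندق سيئ والخدمة بطيئة",
    "منتج رديء ولا أنصح به",
    "تجربة سيئة وطعام بارد",
    "خدمة رديئة وسعر مرتفع",
    "الفندق يقع في وسط المدينة",
    "وصل المنتج يوم الاثنين",
    "المطعم يفتح في المساء",
    "السعر مكتوب على العلبة",
]
labels = ["positive"] * 4 + ["negative"] * 4 + ["neutral"] * 4


def build_classifiers():
    """Return the five algorithm families named in the abstract.

    Hyperparameters are assumed defaults, not values from the paper.
    """
    return {
        "Naive Bayes": MultinomialNB(),
        "SGD": SGDClassifier(random_state=0),
        "Logistic Regression": LogisticRegression(max_iter=1000),
        "K-Nearest Neighbors": KNeighborsClassifier(n_neighbors=3),
        "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
    }


def compare_classifiers(texts, y, percentile=50, folds=3):
    """Cross-validate each classifier behind the same feature pipeline."""
    scores = {}
    for name, clf in build_classifiers().items():
        pipeline = make_pipeline(
            TfidfVectorizer(),                             # assumed representation
            SelectPercentile(chi2, percentile=percentile), # assumed selection step
            clf,
        )
        fold_scores = cross_val_score(
            pipeline, texts, y, cv=folds, scoring="accuracy"
        )
        scores[name] = fold_scores.mean()
    return scores


if __name__ == "__main__":
    for name, accuracy in compare_classifiers(reviews, labels).items():
        print(f"{name}: {accuracy:.3f}")
```

Placing the vectorizer and the selection step inside the pipeline means they are refit on each training fold, so every classifier is scored under the same conditions and no information from the held-out fold leaks into feature selection. Accuracies on this toy data say nothing about the 77.1% reported in the paper.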

Highlights

  • Using the internet to express opinions or share information about different topics has become an essential part of daily life for people all around the world, especially among Arabic speakers

  • We conducted an empirical study of Sentiment Analysis of online reviews written in Modern Standard Arabic and built a system called SSAAR (System for Sentiment Analysis of Arabic Reviews)

  • To obtain an effective system, we built the SPPARF framework ourselves: a dedicated scraping and preprocessing framework adapted to collecting and preprocessing Modern Standard Arabic (MSA) data, which supplies the input to our proposed system

Summary

Introduction

Using the internet to express opinions or share information about different topics has become an essential part of daily life for people all around the world, especially among Arabic speakers. Many researchers have recently shown increasing interest in analyzing the huge amount of information available on the web, including public opinions expressed on blogs, websites, and social networks. Many companies turn to opinion mining to support decision making and improve their commercial activities. Governments likewise draw on citizens' opinions to improve the quality of the services and information they offer. Studies of opinion mining or sentiment analysis on Arabic websites or social media remain limited, and more research is required to solve many issues related to this topic. The difficulties stem from the challenges of processing natural Arabic, with its complex morphology (it is a Morphologically Rich Language), and from the use of different types of dialectal Arabic and of Arabizi (Arabic written in Latin script combined with Arabic numerals).

