Abstract

Most studies in sentiment analysis, specifically those using the Arabic lexicon-based approach, focus on preprocessing the target dataset text or textual data collected from Twitter (the Twitter dataset) rather than on the lexicon itself. This study proposes a new method: we concentrate first on building a new sentiment lexicon with a reasonable number of words, and then apply adequate preprocessing to the lexicon's words in addition to the Twitter dataset. The study presents a Saudi dialect sentiment lexicon called SaudiSentiPlus, which contains 7139 words generated mostly from Saudi tweets and other dictionaries. It also presents two lexicon-based algorithms for the Saudi dialect that handle prefix and suffix letters in order to improve the performance of the proposed lexicon. The experiment conducted to evaluate SaudiSentiPlus comprises four phases, and precision, recall, accuracy, and F-score are measured in each phase. We built our testing dataset from Twitter by focusing on Saudi dialect hashtags (971 thousand tweets from 162 hashtags). The results show that the accuracy of SaudiSentiPlus with the two lexicon-based algorithms reached 81%.
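The four measures named above have standard definitions, and a minimal sketch of how they could be computed for one class is given here. The function name `binary_metrics` and the label strings are hypothetical; the abstract does not say how the study computed or averaged its scores across classes.

```python
def binary_metrics(gold, predicted, positive="pos"):
    """Precision, recall, accuracy and F-score for one target class.

    `gold` and `predicted` are equal-length lists of labels. The label
    names are illustrative assumptions, not taken from the study.
    """
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    tn = sum(1 for g, p in pairs if g != positive and p != positive)

    # Guard every division so empty or one-sided inputs return 0.0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0

    return {
        "precision": precision,
        "recall": recall,
        "accuracy": accuracy,
        "f_score": f_score,
    }
```

Calling the function once per experimental phase would give one set of four scores for each phase.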

Highlights

  • A Social Network Site (SNS) is a platform that enables people to share their opinions on any issue and to build social relations with individuals within and beyond their social circle [1]. Twitter is one of the most popular SNSs and has been growing rapidly in recent years

  • We evaluated the performance of SaudiSentiPlus through four phases

  • Saudi dialect words are mostly derived from Arabic, and Arabic is a morphologically rich language: the form of a word may vary depending on the presence and position of certain well-known letters. Some of these letters come at the beginning or end of a word
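To illustrate why letters at the beginning or end of a word matter for lexicon lookup, here is a minimal sketch of affix-tolerant matching. It is not the study's algorithm: the prefix and suffix lists, the minimum stem length, the toy lexicon, and the function names are all assumptions chosen for illustration.

```python
# Illustrative affix lists (assumptions): "ال" is the definite article and
# "و" the conjunction "and"; the study's own lists are not given here.
PREFIXES = ("ال", "و", "ب", "ف")
SUFFIXES = ("ها", "هم", "ين", "ات")
MIN_STEM = 2  # assumed lower bound so stripping never leaves a single letter


def candidate_stems(word):
    """Yield the word itself, then forms with a known prefix and/or suffix removed."""
    yield word
    for prefix in ("",) + PREFIXES:
        if prefix and not word.startswith(prefix):
            continue
        rest = word[len(prefix):]
        for suffix in ("",) + SUFFIXES:
            if suffix and not rest.endswith(suffix):
                continue
            stem = rest[:len(rest) - len(suffix)] if suffix else rest
            if len(stem) >= MIN_STEM:
                yield stem


def polarity(word, lexicon):
    """Return the polarity of the first candidate found in the lexicon, else 0."""
    for stem in candidate_stems(word):
        if stem in lexicon:
            return lexicon[stem]
    return 0


def score_tweet(text, lexicon):
    """Sum word polarities over whitespace-separated tokens."""
    return sum(polarity(token, lexicon) for token in text.split())


# Toy lexicon (hypothetical entries): +1 positive, -1 negative.
toy_lexicon = {"جميل": 1, "سيء": -1}
```

With this toy lexicon, `polarity("الجميل", toy_lexicon)` matches the entry "جميل" once the definite article is stripped, whereas an exact-match lookup would miss it.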


Summary

Introduction

A Social Network Site (SNS) is a platform that enables people to share their opinions on any issue and to build social relations with individuals within and beyond their social circle [1]. Twitter is one of the most popular SNSs and has been growing rapidly in recent years. The number of Twitter users has increased by more than 500% since 2009 [1]. Twitter users express their feelings and opinions or spread news and facts through about 200 billion tweets annually, which is about 500 million per day, 350,000 per minute, and 6,000 per second [2]. In 2014, the total number of active Twitter users in the Arab world reached 5,797,500. The country with the highest number of active Twitter users in the Arab region is Saudi Arabia, with 2.4 million users, accounting for over 40% of all active Twitter users in the region. The estimated number of tweets produced by Twitter users in the Arab world in March 2014 was 533,165,900, an average of 17,198,900 tweets per day [3] (see Fig. 1).

