Comparison of sentiment lexicon development techniques for event prediction

Mehmet Kaya, Shannon Conley

doi:10.1007/s13278-015-0315-8

Abstract

What started as a social utility for sharing short bursts of ‘inconsequential information’ has become a powerful information network capable of both tracking and shaping current events. From orchestrating government insurgencies to tracking epidemics, the majority of information shared via Twitter is, according to recent statistics, semantically relevant to contemporary topics. Consequently, researchers consider Twitter an ideal platform for sentiment analysis. Compared with other online arenas such as forum discussions, blogs, and Facebook postings, Twitter frequently yields higher sentiment analysis accuracy because each post is short (140-character limit per Tweet). Various natural language processing techniques have been used to perform sentiment classification successfully on groups of Tweets. However, these techniques analyze text using English-specific grammar rules and lexicons. Because fewer resources and tools exist for other languages, researchers often first use machine translation to translate the text into English, and the resulting translation errors often introduce noise that obscures the results. In this study, we analyze the accuracy of sentiment analysis using an ad hoc and a translated sentiment lexicon in terms of their ability to predict the outcome of a future event. Using the Twitter Search and Streaming APIs, we collected some 22,000 tweets about a highly popular TV show, “O Ses Turkiye” (the Turkish version of the globally known voice contest “The Voice of America”), to predict its winner. We first performed a frequency-based statistical classification using an English sentiment lexicon translated into Turkish as well as a small ad hoc Turkish sentiment lexicon generated specifically for this study. We also applied k-means clustering with the two sentiment lexicons to evaluate their accuracy.
Our study concludes that although a translated sentiment lexicon (or training data, for that matter) can give a rough estimate of the outcome of a future event, a language-specific ad hoc lexicon yields better granularity, with higher discriminative power between negative, positive, and neutral tweets. We also show how automatic spell checking and stemming of tweets affect the predictive and discriminative power of an auto-translated sentiment lexicon in the target language.
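
To make the frequency-based lexicon classification mentioned in the abstract concrete, the following is a minimal Python sketch under stated assumptions; it is not the authors' implementation. The lexicon entries, contestant names, example tweets, and the net-score rule (positive tweets minus negative tweets per contestant) are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical toy lexicon; the study's actual lexicons are not reproduced here.
POSITIVE = {"harika", "süper", "güzel"}   # roughly: great, super, beautiful
NEGATIVE = {"kötü", "berbat"}             # roughly: bad, terrible


def tokenize(text):
    """Lowercase and split into word tokens.

    Note: str.lower() is not Turkish-aware (dotted/dotless I), so a real
    pipeline would need locale-specific case folding.
    """
    return re.findall(r"\w+", text.lower())


def classify(text):
    """Label a tweet by comparing counts of positive and negative lexicon hits."""
    tokens = tokenize(text)
    pos = sum(token in POSITIVE for token in tokens)
    neg = sum(token in NEGATIVE for token in tokens)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"


def predict_winner(tweets_by_contestant):
    """Return the contestant with the highest net score, plus all net scores.

    Net score = (# positive tweets) - (# negative tweets); this rule is an
    assumption for the sketch, not a formula taken from the paper.
    """
    net = {}
    for name, tweets in tweets_by_contestant.items():
        labels = Counter(classify(tweet) for tweet in tweets)
        net[name] = labels["positive"] - labels["negative"]
    return max(net, key=net.get), net


# Invented example tweets, for illustration only.
TOY_TWEETS = {
    "contestant_a": ["Harika bir ses, süper!", "çok güzel", "kötü performans"],
    "contestant_b": ["berbat", "güzel şarkı", "bugün final var"],
}

if __name__ == "__main__":
    winner, scores = predict_winner(TOY_TWEETS)
    print(winner, scores)
```

Swapping `POSITIVE` and `NEGATIVE` for a machine-translated lexicon versus a hand-built one is the kind of comparison the abstract describes; spell correction and stemming would slot in inside `tokenize`.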

Journal: Social Network Analysis and Mining
Publication Date: Jan 12, 2016
Citations: 5

Similar Papers

Sinhala Sentiment Analysis using Corpus based Sentiment Lexicon
P. D. T. Chathuranga ... S. A. S. Lorensuhewa
01 Sep 2019

Bias aware lexicon-based Sentiment Analysis of Malay dialect on social media data: A study on the Sabah Language
Mohd Hanafi Ahmad Hijazi ... Frans Coenen
01 Oct 2016

영역별 맞춤형 감성사전 구축을 통한 영화리뷰 감성분석 [Sentiment analysis of movie reviews through construction of domain-specific sentiment lexicons]
Sang Hoon Lee ... Jing Cui
Journal of Intelligence and Information Systems | VOL. 22
30 Jun 2016

Does BERT Look at Sentiment Lexicon?
Elena Razova ... Sergey Vychegzhanin
01 Jan 2021
