Abstract

We propose a novel method for computing sentiment orientation that outperforms supervised learning approaches in time and memory complexity while remaining statistically indistinguishable from them in accuracy. Our method rests on a novel approach to generating unigram, bigram and trigram lexicons. The proposed method, called frequentiment, calculates the frequency of features (words) in a document and averages their impact on the sentiment score relative to documents that do not contain those features. We then use ensemble classification to improve the overall accuracy of the method. Importantly, the frequentiment-based lexicons with sentiment threshold selection outperform other popular lexicons and some supervised learners, while being 3–5 times faster than the supervised approach. We compare 37 methods (lexicons, ensembles taking the lexicons' predictions as input, and supervised learners) applied to 10 Amazon review data sets, and provide the first statistical comparison of sentiment annotation methods that includes ensemble approaches. It is one of the most comprehensive comparisons of domain sentiment analysis in the literature.

Highlights

  • Sentiment analysis of texts means assigning a measure of how positive, neutral or negative the text is

  • In this paper we present a continuation of our work in [27], where we used sentiment lexicons as first-stage classifiers and employed a decision tree as a fusion classifier trained on the lexicons' outputs

  • We propose a new method for lexicon generation—frequentiment—based on the increase in likelihood of a sentiment score when a document contains a given feature, averaged per feature
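The paper does not spell out the lexicon-generation procedure at this point, but the idea described above can be sketched as follows. This is a minimal, illustrative reading of frequentiment, not the authors' exact formulation: for each unigram, the lexicon weight is the difference between the mean review score of documents containing the word and the mean score of documents that do not. The function names and the assumption that scores are numeric star ratings are ours.

```python
def build_frequentiment_lexicon(documents, scores):
    """Sketch of a frequentiment-style unigram lexicon.

    For each word, compare the mean sentiment score of documents
    that contain the word against the mean score of documents that
    do not; the difference becomes the word's lexicon weight.
    """
    token_sets = [set(doc.lower().split()) for doc in documents]
    vocab = set().union(*token_sets)
    lexicon = {}
    for word in vocab:
        with_w = [s for toks, s in zip(token_sets, scores) if word in toks]
        without_w = [s for toks, s in zip(token_sets, scores) if word not in toks]
        if with_w and without_w:  # skip words present in all or no documents
            lexicon[word] = (sum(with_w) / len(with_w)
                             - sum(without_w) / len(without_w))
    return lexicon


def frequentiment_score(document, lexicon):
    """Score a document as the average lexicon weight of its known words."""
    toks = [t for t in document.lower().split() if t in lexicon]
    return sum(lexicon[t] for t in toks) / len(toks) if toks else 0.0
```

On a toy corpus of three reviews scored 5, 5 and 1, a word like "great" that appears only in the 5-star reviews receives a strongly positive weight, while "bad" receives a negative one; a sentiment threshold on the resulting document score then yields the positive/neutral/negative label.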


Summary

Introduction

Sentiment analysis of texts means assigning a measure of how positive, neutral or negative the text is. It can be performed by experts, automatically, or both, as different sentiment classifications can be treated as input to improve accuracy. To increase the accuracy of the results, different annotators would annotate a given text and one would check how many annotations agree. The intuition behind this approach is that if more people give the same response to the same text, the probability that the response is correct rises. On the other hand, this approach is expensive, time-consuming and may require sophisticated methods of selecting annotators to attain a real rise in accuracy.
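The agreement-based intuition above reduces, in its simplest form, to a plurality vote over annotations with the agreement rate as a confidence proxy. A minimal sketch (the function name is ours; ties fall to whichever label `Counter` returns first):

```python
from collections import Counter


def majority_label(annotations):
    """Return the plurality label among annotations and the agreement rate.

    A higher agreement rate suggests (but does not guarantee) a more
    reliable label, which is the intuition behind multi-annotator setups.
    """
    counts = Counter(annotations)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(annotations)
```

For example, three annotators labeling a review as positive, positive and negative yield the label "positive" with an agreement rate of 2/3.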

