A Review of Feature Selection Techniques in Sentiment Analysis Using Filter, Wrapper, or Hybrid Methods

Pulung Hendro Prastyo, Risanuri Hidayat, Igi Ardiyanto

doi:10.1109/icst50505.2020.9732885

Abstract

Sentiment analysis is a text mining field that classifies the polarity of document texts as positive, neutral, or negative. Document texts tend to contain noisy or irrelevant features, so feature selection is needed to address them. Feature selection remains a challenge in building accurate sentiment analysis models. It is crucial for improving machine learning algorithms because it can reduce the dimensionality of the feature space, remove irrelevant features, select valuable features, and increase learning accuracy. Therefore, this study reviews feature selection techniques in three categories: filter, wrapper, and hybrid methods. The review concludes that all three categories can select essential features, reduce the dimensionality of the feature space, and improve the accuracy of machine learning algorithms. Filter methods are easy to implement and faster than wrapper and hybrid methods, whereas wrapper methods are more accurate than filter methods but slower. Hybrid methods are the most effective at removing redundant and irrelevant features and at improving classifier performance. However, hybrid methods are more complex and therefore incur a high computational cost.
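
To make the three categories concrete, the following is a minimal sketch in plain Python, not the procedure of any study covered by the review. Everything in it is an assumption made for illustration: the toy corpus, the choice of the chi-square statistic as the filter criterion, greedy forward selection as the wrapper search, the tiny term-voting classifier scored by leave-one-out accuracy, and all function names. The hybrid variant shown here simply applies the filter first and then runs the wrapper on the shortlist, which is one common way to combine them.

```python
from collections import Counter

# Toy labelled corpus (hypothetical data): 1 = positive, 0 = negative.
DOCS = [
    ("great phone love the screen", 1),
    ("love this great battery", 1),
    ("excellent screen great value", 1),
    ("good value love it", 1),
    ("terrible phone hate the battery", 0),
    ("awful screen hate it", 0),
    ("terrible value awful battery", 0),
    ("bad phone hate this", 0),
]
SAMPLES = [(set(text.split()), label) for text, label in DOCS]
VOCAB = sorted(set().union(*(tokens for tokens, _ in SAMPLES)))


def chi2_score(term, samples):
    """Chi-square statistic of the 2x2 table: term presence vs. class."""
    a = sum(1 for toks, y in samples if term in toks and y == 1)
    b = sum(1 for toks, y in samples if term in toks and y == 0)
    c = sum(1 for toks, y in samples if term not in toks and y == 1)
    d = sum(1 for toks, y in samples if term not in toks and y == 0)
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def filter_select(samples, vocab, k):
    """Filter: rank terms by a statistic; no classifier is involved."""
    ranked = sorted(vocab, key=lambda t: (-chi2_score(t, samples), t))
    return ranked[:k]


def loo_accuracy(samples, features):
    """Leave-one-out accuracy of a tiny term-voting classifier."""
    correct = 0
    for i, (toks, y) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]
        weights = Counter()
        for t_toks, t_y in train:
            for f in features:
                if f in t_toks:
                    weights[f] += 1 if t_y == 1 else -1
        score = sum(weights[f] for f in features if f in toks)
        correct += int((1 if score > 0 else 0) == y)
    return correct / len(samples)


def wrapper_select(samples, candidates, max_features):
    """Wrapper: greedy forward selection scored by the classifier itself."""
    selected, best = [], loo_accuracy(samples, [])
    while len(selected) < max_features:
        scored = [(loo_accuracy(samples, selected + [f]), f)
                  for f in candidates if f not in selected]
        if not scored:
            break
        # Highest accuracy wins; ties are broken alphabetically.
        acc, feat = min(scored, key=lambda p: (-p[0], p[1]))
        if acc <= best:  # stop when no candidate improves accuracy
            break
        selected.append(feat)
        best = acc
    return selected, best


def hybrid_select(samples, vocab, k, max_features):
    """Hybrid: cheap filter shortlist first, then wrapper search on it."""
    shortlist = filter_select(samples, vocab, k)
    return wrapper_select(samples, shortlist, max_features)


if __name__ == "__main__":
    print("filter :", filter_select(SAMPLES, VOCAB, 5))
    print("wrapper:", wrapper_select(SAMPLES, VOCAB, 3))
    print("hybrid :", hybrid_select(SAMPLES, VOCAB, 5, 3))
```

The sketch also reflects the cost trade-off described above: the filter scores each term once, while the wrapper retrains and re-evaluates the classifier for every candidate at every step, and the hybrid reduces that cost by shrinking the candidate set before the wrapper search begins.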

Full Text