Intelligent Hybrid Feature Selection for Textual Sentiment Classification

Jawad Khan, Aftab Alam, Youngmoon Lee

doi:10.1109/access.2021.3118982

Open Access

https://doi.org/10.1109/access.2021.3118982

Journal: IEEE Access
Publication Date: Jan 1, 2021
Citations: 20
License type: CC BY 4.0

Affiliation: Hamad bin Khalifa University, Hanyang University

Abstract

Sentiment Analysis (SA) aims to extract useful information from online Unstructured User-Generated Content (UUGC) and classify it into positive and negative classes. State-of-the-art SA techniques suffer from a high-dimensional feature space because of noisy and irrelevant features in the UUGC. Researchers have proposed feature extraction and selection techniques to reduce this high-dimensional feature space, but these techniques fall short in extracting and selecting the most effective sentiment features for sentiment model learning. Effective feature extraction and selection are important for SA because they can boost the learning algorithm's predictive performance while reducing the high-dimensional feature space. To address these concerns, we propose an Intelligent Hybrid Feature Selection for Sentiment Analysis (IHFSSA) based on ensemble learning methods. IHFSSA first identifies sentiment features in the review text using the Penn Treebank part-of-speech tagset and integrated Wide Coverage Sentiment Lexicons (WCSL). A subset of sentiment features is then selected with a fast and simple rank-based ensemble of multiple filter feature selection methods. The selected sentiment features are further refined by applying a wrapper-based backward feature selection method. Finally, for textual sentiment classification, the well-known classification algorithms Support Vector Machine (SVM), Naive Bayes (NB), and Generalized Linear Model (GLM) are trained in an ensemble model on the refined sentiment feature set. An in-depth evaluation on benchmark datasets from heterogeneous domains demonstrates that IHFSSA outperforms existing SA techniques.

Highlights

Blogs, discussion forums, shared knowledge-seeking networks, social network platforms, and product and movie review portals [1]–[5] are only a handful of the social media platforms that have emerged with Web 2.0 [6], [7].
We focus on document-level Sentiment Analysis (SA), splitting each document into sentences with a sentence parser and each sentence into words with a tokenizer.
We propose an intelligent model for textual SA based on hybrid feature selection with ensemble learning methods.

Summary

INTRODUCTION

Blogs, discussion forums, shared knowledge-seeking networks, social network platforms, and product and movie review portals [1]–[5] are only a handful of the social media platforms that have emerged with Web 2.0 [6], [7]. Jing et al. [15] proposed two feature selection methods, modified categorical proportional difference (MCPD) and balance category feature (BCF), that select attributes from text reviews. Their experimental results showed that combining the BCF and MCPD methods can reduce the feature space and improve sentiment classification performance. Kalaivani et al. [49] proposed a machine learning-based feature selection method utilizing Information Gain (IG) and a Genetic Algorithm. They applied NB, logistic regression, SVM, and ensemble techniques to multi-domain datasets and movie review datasets for evaluation. According to the literature review, researchers have introduced various feature extraction and/or selection strategies, as well as ensemble learning methods, for sentiment classification. The technical details of the proposed methodology are elaborated in the following sub-sections.

FEATURE REPRESENTATION

INTEGRATED WIDE COVERAGE SENTIMENT LEXICONS

SENTIMENT FEATURES EXTRACTION
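
The abstract states that sentiment features are identified with the Penn Treebank part-of-speech tagset and the integrated sentiment lexicons. The following is a minimal sketch of that idea, not the authors' implementation: it assumes the tokens are already tagged, that adjectives, adverbs, verbs, and nouns are the tags of interest (the constant `SENTIMENT_TAG_PREFIXES` is a hypothetical choice), and that the lexicon is a plain set of lowercase words.

```python
# Assumption: these Penn Treebank tag prefixes mark candidate sentiment words.
SENTIMENT_TAG_PREFIXES = ("JJ", "RB", "VB", "NN")


def extract_sentiment_features(tagged_tokens, lexicon):
    """Keep tokens whose POS tag is a candidate tag and whose word is in the lexicon."""
    features = []
    for word, tag in tagged_tokens:
        token = word.lower()
        if tag.startswith(SENTIMENT_TAG_PREFIXES) and token in lexicon:
            features.append(token)
    return features


# Hypothetical lexicon and hand-tagged review sentence, for illustration only.
lexicon = {"great", "boring", "love"}
tagged_review = [
    ("The", "DT"), ("plot", "NN"), ("was", "VBD"),
    ("Great", "JJ"), ("but", "CC"), ("boring", "JJ"),
]
sentiment_features = extract_sentiment_features(tagged_review, lexicon)
```

In practice the tagged tokens would come from a part-of-speech tagger and the lexicon from the merged WCSL resources; both are stubbed here to keep the sketch self-contained.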

FEATURES SELECTION
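
The abstract describes a rank-based ensemble of multiple filter methods. One simple way to realize such an ensemble is to rank the features under each filter and keep those with the best average rank. The sketch below assumes mean-rank aggregation and uses two hypothetical filters with made-up scores; the source does not state which filters or which aggregation rule the authors use.

```python
from statistics import mean


def ranks(scores):
    """Map each feature to its rank under one filter (1 = highest score)."""
    ordered = sorted(scores, key=lambda f: (-scores[f], f))
    return {feature: position for position, feature in enumerate(ordered, start=1)}


def rank_ensemble(filter_scores, k):
    """Return the k features with the lowest mean rank across all filters."""
    per_filter = [ranks(scores) for scores in filter_scores.values()]
    features = set(per_filter[0])
    mean_rank = {f: mean(r[f] for r in per_filter) for f in features}
    # Ties on mean rank are broken alphabetically so the result is deterministic.
    return sorted(features, key=lambda f: (mean_rank[f], f))[:k]


# Hypothetical filter scores (higher = more relevant), for illustration only.
filter_scores = {
    "filter_a": {"good": 0.9, "bad": 0.8, "movie": 0.1, "plot": 0.2},
    "filter_b": {"good": 0.7, "bad": 0.9, "movie": 0.3, "plot": 0.1},
}
selected = rank_ensemble(filter_scores, k=2)
```

Aggregating ranks rather than raw scores avoids having to normalize filters that produce values on different scales.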

WRAPPER-BASED BACKWARD FEATURE SELECTION
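
Wrapper-based backward selection generally starts from the full candidate set and drops a feature whenever a model trained without it scores at least as well. The sketch below is a generic greedy version under that assumption; `toy_score` is a hypothetical stand-in for training and validating a classifier, and the stopping rule is not taken from the source.

```python
def backward_select(features, evaluate):
    """Greedy backward elimination: drop a feature if the score does not get worse."""
    selected = list(features)
    best = evaluate(selected)
    improved = True
    while improved and len(selected) > 1:
        improved = False
        for feature in list(selected):
            candidate = [f for f in selected if f != feature]
            score = evaluate(candidate)
            if score >= best:
                # Removing this feature did not hurt, so keep the smaller set.
                selected, best = candidate, score
                improved = True
                break
    return selected, best


# Hypothetical evaluation: reward useful words, penalize noise words.
USEFUL = {"excellent", "terrible"}


def toy_score(subset):
    hits = len(USEFUL & set(subset))
    noise = len(set(subset) - USEFUL)
    return hits - 0.1 * noise


refined, refined_score = backward_select(
    ["excellent", "the", "terrible", "movie"], toy_score
)
```

With a real classifier, `evaluate` would typically return cross-validated accuracy, which makes each call expensive; this is why the wrapper step is applied only after the filter ensemble has already reduced the feature set.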

CLASSIFICATION ALGORITHMS AND ENSEMBLE LEARNING METHOD
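
The abstract says that SVM, NB, and GLM are trained in an ensemble model but does not state how their outputs are combined. As an illustration only, the sketch below assumes plain majority voting over per-document labels; the prediction lists are hypothetical, not results from the paper.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one majority label per document."""
    combined = []
    for labels in zip(*predictions.values()):
        combined.append(Counter(labels).most_common(1)[0][0])
    return combined


# Hypothetical predictions from three classifiers on three documents.
predictions = {
    "svm": ["pos", "neg", "pos"],
    "nb": ["pos", "pos", "neg"],
    "glm": ["neg", "neg", "neg"],
}
ensemble_labels = majority_vote(predictions)
```

With three classifiers and two classes a vote can never tie, which keeps this combination rule unambiguous for binary sentiment labels.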

EVALUATION MEASURES

EXPERIMENTAL SETTING

PERFORMANCE ANALYSIS OF HYBRID FEATURE SELECTION APPROACH

RESULTS SUMMARY

CONCLUSION
