Ensemble Based Classification of Sentiments Using Forest Optimization Algorithm

Mehreen Naz, Ayesha Khan, Kashif Zafar

doi:10.3390/data4020076

Abstract

Feature subset selection is the process of choosing a set of relevant features from a high-dimensional dataset to improve classifier performance. In sentiment analysis, the meaningful words extracted from the data form the feature set. Many evolutionary algorithms, such as the Genetic Algorithm (GA) and Particle Swarm Optimization (PSO), have been applied to the feature subset selection problem, but computational performance can still be improved. This research presents a solution to the feature subset selection problem for the classification of sentiments using ensemble-based classifiers. It consists of a hybrid technique that combines minimum redundancy and maximum relevance (mRMR) with Forest Optimization Algorithm (FOA)-based feature selection. Ensemble-based classification is implemented to optimize the results of individual classifiers. FOA as a feature selection technique has been applied to various classification datasets from the UCI machine learning repository. The classifiers used in the ensemble methods for the UCI repository datasets are k-Nearest Neighbor (k-NN) and Naïve Bayes (NB). For the classification of sentiments, a 15–20% improvement has been recorded. The dataset used for the classification of sentiments is Blitzer’s dataset, which consists of reviews of electronic products. The results are further improved by an ensemble of k-NN, NB, and Support Vector Machine (SVM), with an accuracy of 95% on the sentiment classification task.
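The abstract does not say how the k-NN, NB, and SVM outputs are combined. As a minimal sketch, assuming simple majority voting (one common ensemble rule, not necessarily the authors' method), the combination step could look like the following. The function names and the toy predictions are hypothetical and are not taken from the paper.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one list by majority vote.

    predictions: list of lists, one inner list per classifier, equal lengths.
    Ties are broken in favour of the classifier listed first.
    """
    if not predictions:
        raise ValueError("need predictions from at least one classifier")
    n = len(predictions[0])
    if any(len(p) != n for p in predictions):
        raise ValueError("all classifiers must label the same samples")
    combined = []
    for votes in zip(*predictions):
        counts = Counter(votes)
        top = max(counts.values())
        # First label, in classifier order, that reaches the top count.
        combined.append(next(v for v in votes if counts[v] == top))
    return combined


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data: each base classifier errs on a different review.
y_true = ["pos", "neg", "pos", "neg", "pos"]
knn = ["pos", "neg", "neg", "neg", "pos"]
nb = ["pos", "pos", "pos", "neg", "neg"]
svm = ["neg", "neg", "pos", "neg", "pos"]

ensemble = majority_vote([knn, nb, svm])
```

In this toy case the base classifiers make mistakes on different samples, so the vote corrects each of them. On real data the gain depends on how correlated the classifiers' errors are.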

Highlights

Classification of sentiments is basically a technique to determine the polarity of a given text, document, or sentence
The feature subset selection process is performed by the Forest Optimization Algorithm, and the features or attributes are pre-processed with the minimum redundancy and maximum relevance (mRMR) technique
The Forest Optimization Algorithm is first applied to some benchmark classification datasets and their results are compared with individual classifiers like Naïve Bayes and k-Nearest
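The mRMR pre-processing step mentioned in the highlights can be illustrated with a small greedy sketch. It assumes discrete features (for example, word presence), mutual information as the relevance and redundancy measure, and the difference form of the criterion (relevance minus mean redundancy). These choices, the function names, and the toy data are assumptions for illustration; the page does not give the authors' exact formulation, and the FOA search itself is not shown.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information in bits between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y])) for (x, y), c in pxy.items()
    )


def mrmr(features, labels, k):
    """Greedily pick k feature names by relevance minus mean redundancy.

    features: dict mapping feature name -> list of discrete values.
    labels:   list of class labels, same length as each feature column.
    """
    relevance = {
        name: mutual_information(col, labels) for name, col in features.items()
    }
    selected, remaining = [], list(features)
    while remaining and len(selected) < k:

        def score(name):
            if not selected:
                return relevance[name]
            redundancy = sum(
                mutual_information(features[name], features[s]) for s in selected
            ) / len(selected)
            return relevance[name] - redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical word-presence features for eight reviews (1 = positive label).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "great": [1, 1, 1, 0, 0, 0, 0, 0],
    "excellent": [1, 1, 0, 0, 0, 0, 0, 0],  # overlaps heavily with "great"
    "refund": [0, 0, 1, 0, 1, 1, 1, 0],
}
selected = mrmr(features, labels, 2)  # ["great", "refund"]
```

Here "excellent" is more relevant to the label than "refund" on its own, but it is largely redundant once "great" has been chosen, so the second pick is "refund". A reduced feature set of this kind would then be handed to the FOA-based search.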

Summary

Introduction

Classification of sentiments is a technique for determining the polarity of a given text, document, or sentence. Sentiment analysis and classification use both machine learning and natural language processing (NLP) techniques. Social websites have become an important source of information. People share their opinions on these websites about almost everything, e.g., products, books, movies, and social or political issues. These reviews are in the form of text. The major issue that arises when gathering data from social networking sites is that the reviews often contain grammatical and/or spelling mistakes, and the data is usually so large that correcting those mistakes by hand is not feasible. Data extracted from social networking sites usually contains far more unwanted information, including HTML tags, than actual meaningful review text. There are several pre-processing techniques to apply on the extracted data first and

Methods

Results

Conclusion

Journal: Data	Publication Date: May 23, 2019
Citations: 10	License type: CC BY 4.0
