Abstract

Micro-blog content carries considerable uncertainty, primarily because the data are noisy and heterogeneous, may be structured or unstructured, and are often high-dimensional, ambiguous, vague or imprecise. This makes feature engineering for sentiment prediction arduous. Population-based meta-heuristics, especially nature-inspired ones, have been proposed for feature selection in various pertinent studies because they can accept a less optimal solution with some probability and thereby avoid becoming trapped in local optima. This research demonstrates the use of two such swarm intelligence algorithms, binary grey wolf and binary moth flame, for feature optimization to improve sentiment classification accuracy. The study is conducted on tweets from two benchmark Twitter corpora (SemEval 2016 and SemEval 2017). The tweets are first analyzed using the conventional term frequency-inverse document frequency (TF-IDF) statistical weighting filter for feature extraction, and subsequently using the swarm-based algorithms. The features are used to train five baseline classifiers: naive Bayes, support vector machine, k-nearest neighbor, multilayer perceptron and decision tree. The results show that feature subset selection with the population-based meta-heuristic algorithms outperforms the baseline supervised learning algorithms. For the binary grey wolf algorithm, an average accuracy improvement of 9.4% is observed with an average feature reduction of approximately 20.5%. For the binary moth flame algorithm, an average accuracy improvement of 10.6% is observed with an average feature reduction of approximately 40%. The highest accuracy, 76.5%, is observed for the support vector machine with the binary grey wolf optimizer on the SemEval 2016 benchmark dataset.
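
The building blocks named above can be illustrated with a minimal sketch. It shows TF-IDF weighting, a binary mask that keeps or drops each feature, and a fitness value that trades classification error against the fraction of features kept. The function names (`tfidf`, `apply_mask`, `fitness`), the `alpha` weight, the `log(N/df)` form of the inverse document frequency, and the shape of the fitness function are all assumptions chosen for illustration; the abstract does not specify them, and the swarm update rules of the two optimizers are deliberately left out.

```python
import math


def tfidf(docs):
    """TF-IDF weights for tokenized, non-empty documents.

    Assumes tf = count / document length and idf = log(N / df),
    which is one common variant among several.
    """
    vocab = sorted({term for doc in docs for term in doc})
    n_docs = len(docs)
    doc_freq = {t: sum(1 for doc in docs if t in doc) for t in vocab}
    matrix = []
    for doc in docs:
        row = [
            (doc.count(t) / len(doc)) * math.log(n_docs / doc_freq[t])
            for t in vocab
        ]
        matrix.append(row)
    return vocab, matrix


def apply_mask(matrix, mask):
    """Keep only the columns whose mask bit is 1 (a candidate feature subset)."""
    return [[v for v, keep in zip(row, mask) if keep] for row in matrix]


def fitness(accuracy, mask, alpha=0.99):
    """Hypothetical wrapper objective (lower is better).

    Weighted sum of classification error and the fraction of features kept;
    alpha is an assumed trade-off weight, not a value from the study.
    """
    selected_ratio = sum(mask) / len(mask)
    return alpha * (1.0 - accuracy) + (1.0 - alpha) * selected_ratio


if __name__ == "__main__":
    docs = [["good", "movie"], ["bad", "movie"]]
    vocab, matrix = tfidf(docs)
    mask = [0, 1, 1]  # one candidate solution: drop "bad", keep the rest
    print(vocab)
    print(apply_mask(matrix, mask))
    print(fitness(accuracy=0.8, mask=mask))
```

In a wrapper setting, an optimizer would propose many such masks, train a classifier on each masked feature matrix to obtain `accuracy`, and keep the mask with the best fitness.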
