A modified multi objective heuristic for effective feature selection in text classification

D Thiyagarajan, N Shanthi

doi:10.1007/s10586-017-1150-7

Abstract

Text categorization is the process of sorting text documents into one or more predefined categories or classes of similar documents. Differences in categorization results arise from the feature set used to associate a given document with a given category. The task is challenging mainly because the number of discriminating words can be large, which leaves many current algorithms unable to complete it. Most such tasks involve both relevant and irrelevant features. The objective here is to classify text on the basis of selected features, using pre-processing to reduce the dimensionality of the feature vector and increase classification accuracy. Meta-heuristic algorithms are the methods most commonly used to facilitate this selection. The artificial fish swarm algorithm (AFSA) draws on the underlying intelligence of fish swarming behaviour to tackle optimization and combinatorial problems. The method has been highly successful in diverse applications but suffers from certain limitations, such as a lack of multiplicity. A modified AFSA (MAFSA) is therefore proposed that adds a crossover to its operation in order to improve feature selection for text classification. Support Vector Machine (SVM), AdaBoost and naive Bayes classifiers are all used here. MAFSA proved superior to AFSA in terms of both precision and the number of selected features.
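The abstract does not spell out the operators, so the following is only a minimal sketch of the general idea: a swarm of binary feature masks that is improved by crossover with the best mask found so far. It is not the authors' MAFSA. The function names, the single-point crossover, the one-bit "move", the toy `relevance` scores and the `penalty` weight are all assumptions made for illustration; in the paper the quality of a feature subset would come from classifier performance (SVM, AdaBoost, naive Bayes), not from a fixed score list.

```python
import random


def crossover(mask_a, mask_b, rng):
    """Single-point crossover of two equal-length binary feature masks."""
    point = rng.randrange(1, len(mask_a))
    return mask_a[:point] + mask_b[point:]


def fitness(mask, relevance, penalty=0.1):
    """Toy objective (assumption): total relevance of the selected
    features minus a penalty for each feature kept."""
    selected = [r for bit, r in zip(mask, relevance) if bit]
    return sum(selected) - penalty * len(selected)


def select_features(relevance, swarm_size=20, iterations=50, seed=0):
    """Search for a binary mask over len(relevance) features (needs >= 2)."""
    rng = random.Random(seed)
    n = len(relevance)
    swarm = [[rng.randint(0, 1) for _ in range(n)] for _ in range(swarm_size)]
    best = max(swarm, key=lambda m: fitness(m, relevance))
    for _ in range(iterations):
        new_swarm = []
        for fish in swarm:
            # Recombine with the best mask, then flip one random bit
            # as a crude stand-in for a swarm "move".
            child = crossover(fish, best, rng)
            i = rng.randrange(n)
            child[i] = 1 - child[i]
            # Keep the child only if it is at least as fit as its parent.
            if fitness(child, relevance) >= fitness(fish, relevance):
                new_swarm.append(child)
            else:
                new_swarm.append(fish)
        swarm = new_swarm
        best = max(swarm + [best], key=lambda m: fitness(m, relevance))
    return best


if __name__ == "__main__":
    # Hypothetical per-feature relevance scores for six terms.
    scores = [0.9, 0.05, 0.8, 0.01, 0.02, 0.7]
    print(select_features(scores))
```

The two terms in `fitness` mirror the two criteria the abstract reports, classification quality and the number of selected features, combined here into a single weighted score for simplicity.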

Journal: Cluster Computing
Publication Date: Oct 5, 2017
Citations: 8

