Evolutionary instance selection for text classification

Chih-Fong Tsai, Zong-Yao Chen, Shih-Wen Ke

doi:10.1016/j.jss.2013.12.034

Abstract

Text classification usually relies on a model learned from training examples to classify text documents automatically. However, as text document repositories grow rapidly, the storage requirements and computational cost of model learning increase. Instance selection is one way to address these limitations: it aims to reduce the data size by filtering out noisy data from a given training dataset. In this paper, we introduce a novel algorithm for this task, namely a biological-based genetic algorithm (BGA). BGA fits a “biological evolution” into the evolutionary process, where the most streamlined process also complies with the reasonable rules. In other words, after long-term evolution, organisms find the most efficient way to allocate resources and evolve. Consequently, by closely simulating natural evolution, the algorithm can be both efficient and effective. Experimental results on the TechTC-100 and Reuters-21578 datasets show that BGA outperforms five state-of-the-art algorithms. In particular, using BGA to select text documents not only yields the largest dataset reduction rate but also requires the least computational time. Moreover, with BGA the k-NN and SVM classifiers achieve similar or slightly better classification accuracy than with GA.
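
To make the idea of evolutionary instance selection concrete, the following is a minimal sketch of a plain generational genetic algorithm that selects a subset of training instances. It is not the authors' BGA, whose operators are not described in the abstract. The 1-NN fitness, the weight `alpha` between accuracy and reduction rate, the population settings, and the toy two-dimensional data are all assumptions made for illustration.

```python
import random

def nn_accuracy(train, test):
    """1-NN accuracy of `test` points classified against `train` points."""
    if not train:
        return 0.0
    correct = 0
    for x, label in test:
        # Nearest training instance by squared Euclidean distance.
        nearest = min(train, key=lambda t: sum((a - b) ** 2 for a, b in zip(t[0], x)))
        correct += nearest[1] == label
    return correct / len(test)

def fitness(mask, train, valid, alpha=0.5):
    """Weighted sum of validation accuracy and reduction rate (alpha is assumed)."""
    subset = [inst for inst, keep in zip(train, mask) if keep]
    reduction = 1 - len(subset) / len(train)
    return alpha * nn_accuracy(subset, valid) + (1 - alpha) * reduction

def select_instances(train, valid, pop_size=20, generations=30, p_mut=0.05, seed=0):
    """Evolve a keep/drop mask over `train` and return the selected instances."""
    rng = random.Random(seed)
    n = len(train)
    # Each chromosome is a boolean mask: True keeps the instance.
    pop = [[rng.random() < 0.5 for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=lambda m: fitness(m, train, valid), reverse=True)
        parents = ranked[: pop_size // 2]  # truncation selection with elitism
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation with probability p_mut per gene.
            child = [(not g) if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = parents + children
    best = max(pop, key=lambda m: fitness(m, train, valid))
    return [inst for inst, keep in zip(train, best) if keep]

# Toy data (hypothetical): two well-separated classes on a diagonal.
train = [((0.1 * i, 0.1 * i), 0) for i in range(10)] + \
        [((5 + 0.1 * i, 5 + 0.1 * i), 1) for i in range(10)]
valid = [((0.05, 0.05), 0), ((0.45, 0.45), 0), ((5.05, 5.05), 1), ((5.45, 5.45), 1)]

selected = select_instances(train, valid)
print(f"kept {len(selected)} of {len(train)} instances, "
      f"1-NN accuracy on validation: {nn_accuracy(selected, valid):.2f}")
```

In this sketch the reduction rate is simply the fraction of training instances dropped, so raising `alpha` favours accuracy and lowering it favours smaller selected sets. For real text data the instances would be document feature vectors (for example, term weights) rather than 2-D points.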


Journal: Journal of Systems and Software	Publication Date: Jan 4, 2014
Citations: 23

Similar Papers

Instance selection by genetic-based biological algorithm
Zong-Yao Chen ... Wei-Chao Lin
Soft Computing | VOL. 19
21 Jun 2014

SVOIS: Support Vector Oriented Instance Selection for text classification
Chih-Fong Tsai ... Che-Wei Chang
Information Systems | VOL. 38
22 May 2013

Towards high dimensional instance selection: An evolutionary approach
Chih-Fong Tsai ... Zong-Yao Chen
Decision Support Systems | VOL. 61
05 Feb 2014

Evolutionary feature and instance selection for traffic sign recognition
Zong-Yao Chen ... Chih-Fong Tsai
Computers in Industry | VOL. 74
11 Sep 2015
