Bagging and Feature Selection for Classification with Incomplete Data

Cao Truong Tran, Peter Andreae, Bing Xue, Mengjie Zhang

doi:10.1007/978-3-319-55849-3_31

Abstract

Missing values are an unavoidable issue in many real-world datasets. Dealing with missing values is an essential requirement in classification problems, because inadequate treatment of missing values often leads to large classification errors. Some classifiers can work directly with incomplete data, but they often produce large classification errors and generate complex models. Feature selection and bagging have been successfully used to improve classification, but they are mainly applied to complete data. This paper proposes a combination of bagging and feature selection to improve classification with incomplete data. To achieve this, a wrapper-based feature selection method that can work directly with incomplete data is used to select suitable feature subsets for bagging. Experiments on eight incomplete datasets were designed to compare the proposed method with three other popular methods that are able to deal with incomplete data, using C4.5/REPTree as classifiers and Particle Swarm Optimisation as the search technique in feature selection. Results show that the combination of bagging and feature selection not only achieves better classification accuracy than the other methods but also generates less complex models than the bagging method.
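
The general idea of pairing each bagged learner with its own wrapper-selected feature subset can be illustrated with a minimal sketch. This is not the authors' implementation, and several pieces are assumptions made for illustration: a nearest-centroid learner that skips missing values stands in for C4.5/REPTree, plain random search stands in for Particle Swarm Optimisation, out-of-bag accuracy is used as the wrapper's fitness, missing values are encoded as `None`, and all function names are hypothetical.

```python
import random
from collections import Counter

def fit_centroids(X, y, feats):
    """Per-class mean of each selected feature, skipping missing (None) values.
    Assumption: this simple learner replaces C4.5/REPTree for brevity."""
    model = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        model[label] = {}
        for f in feats:
            vals = [r[f] for r in rows if r[f] is not None]
            if vals:  # leave the feature out if it was never observed
                model[label][f] = sum(vals) / len(vals)
    return model

def predict_one(model, x):
    """Nearest class by mean squared distance over the features that are
    observed in x and present in the class centroid."""
    best, best_d = None, float("inf")
    for label in sorted(model):
        centroid = model[label]
        shared = [f for f in centroid if x[f] is not None]
        if not shared:
            continue
        d = sum((x[f] - centroid[f]) ** 2 for f in shared) / len(shared)
        if d < best_d:
            best, best_d = label, d
    return best  # None if no feature could be compared

def accuracy(model, X, y):
    return sum(predict_one(model, x) == t for x, t in zip(X, y)) / len(y)

def select_features(X_tr, y_tr, X_val, y_val, n_feats, rng, n_trials=20):
    """Wrapper selection: each candidate subset is scored by training the
    learner on it. Assumption: random search replaces PSO."""
    best_feats, best_acc = list(range(n_feats)), -1.0
    for _ in range(n_trials):
        feats = [f for f in range(n_feats) if rng.random() < 0.5]
        feats = feats or [rng.randrange(n_feats)]  # never an empty subset
        acc = accuracy(fit_centroids(X_tr, y_tr, feats), X_val, y_val)
        if acc > best_acc:
            best_feats, best_acc = feats, acc
    return best_feats

def fit_bagging(X, y, n_models=10, seed=0):
    """Train one learner per bootstrap sample, each on its own feature subset."""
    rng = random.Random(seed)
    n, n_feats = len(X), len(X[0])
    ensemble = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        chosen = set(idx)
        # Out-of-bag rows score the subsets; fall back to the sample itself.
        oob = [i for i in range(n) if i not in chosen] or idx
        X_b, y_b = [X[i] for i in idx], [y[i] for i in idx]
        X_o, y_o = [X[i] for i in oob], [y[i] for i in oob]
        feats = select_features(X_b, y_b, X_o, y_o, n_feats, rng)
        ensemble.append(fit_centroids(X_b, y_b, feats))
    return ensemble

def predict_bagging(ensemble, x):
    """Majority vote over the learners that could make a prediction."""
    votes = Counter(p for p in (predict_one(m, x) for m in ensemble) if p is not None)
    return votes.most_common(1)[0][0] if votes else None
```

Under these assumptions, `fit_bagging(X, y)` takes rows as lists of numbers or `None`, and `predict_bagging(ensemble, x)` accepts a query that itself contains `None` entries. Because each learner keeps only its selected features, the individual models are smaller than learners trained on all features, which is the kind of complexity reduction the abstract refers to.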

Similar Papers

Improving performance for classification with incomplete data using wrapper-based feature selection
Cao Truong Tran ... Peter Andreae
Evolutionary Intelligence | VOL. 9
09 Aug 2016

Improving performance of classification on incomplete data using feature selection and clustering
Cao Truong Tran ... Lam Thu Bui
Applied Soft Computing | VOL. 73
29 Sep 2018

Classifying Incomplete Gene-Expression Data: Ensemble Learning with Non-Pre-Imputation Feature Filtering and Best-First Search Technique.
Yuanting Yan ... Tao Dai
International Journal of Molecular Sciences | VOL. 19
30 Oct 2018

Feature selection using Lebesgue and entropy measures for incomplete neighborhood decision systems
Lin Sun ... Shiguang Zhang
Knowledge-Based Systems | VOL. 186
14 Aug 2019
