Abstract
More training instances can lead to better classification accuracy, but accuracy can also degrade if the additional instances introduce more noise and outliers. Additional training instances also demand additional computational resources in subsequent data mining operations. Instance selection algorithms identify subsets of training instances that ideally increase accuracy, or at least do not decrease it significantly. Many instance selection algorithms exist, but no single algorithm dominates the others in general. Moreover, existing instance selection algorithms do not allow the instance selection rate to be controlled directly. In this paper, we present a simple and generic cluster-oriented instance selection algorithm for classification problems. The proposed algorithm runs unsupervised K-Means clustering on the training instances and, for a given selection rate, selects instances from the centers and the borders of the clusters. On 24 benchmark classification problems, when very similar percentages of instances are selected by the various instance selection algorithms, K-Nearest-Neighbours classifiers achieve more than 2%–3% higher accuracy using instances selected by our method than using those selected by other state-of-the-art generic instance selection algorithms.
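To make the selection step concrete, the following is a minimal Python sketch of the cluster-oriented idea described above: cluster the training set with K-Means and keep, in each cluster, the instances nearest to the centroid ("centers") and farthest from it ("borders"), subject to a selection rate. This is not the paper's reference implementation; the function name, the even split of the per-cluster budget between centers and borders, and the use of scikit-learn's KMeans are our assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_oriented_selection(X, y, selection_rate=0.2, n_clusters=10, random_state=0):
    """Sketch: keep per-cluster center and border instances (assumed split)."""
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state).fit(X)
    keep = []
    for c in range(n_clusters):
        idx = np.where(km.labels_ == c)[0]
        if idx.size == 0:
            continue
        # Distance of each instance in the cluster to its centroid.
        dists = np.linalg.norm(X[idx] - km.cluster_centers_[c], axis=1)
        order = np.argsort(dists)
        n_keep = max(1, int(round(selection_rate * idx.size)))
        # Assumption: split the per-cluster budget evenly between centers and borders.
        n_center = (n_keep + 1) // 2
        n_border = n_keep - n_center
        keep.extend(idx[order[:n_center]])       # instances closest to the centroid
        if n_border > 0:
            keep.extend(idx[order[-n_border:]])  # instances farthest from the centroid
    keep = np.array(sorted(set(keep)))
    return X[keep], y[keep]
```

The selected subset can then be fed to any classifier (e.g., a K-Nearest-Neighbours model) in place of the full training set, which is what allows the selection rate to be controlled directly.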