An Efficient Prototype Selection Algorithm Based on Dense Spatial Partitions

Joel Luís Carbonera, Mara Abel

doi:10.1007/978-3-319-91262-2_26

Abstract

Prototype selection techniques have been applied to big data to reduce the computational resources needed to run data mining approaches. However, most of the proposed approaches for prototype selection have a high time complexity and therefore cannot themselves be applied to big data. In this paper, we propose an efficient approach for prototype selection. It adopts the notion of spatial partition to efficiently divide the dataset into sets of similar instances. In a second step, the algorithm extracts a prototype from each of the densest spatial partitions previously identified. The approach was evaluated on 15 well-known datasets in a classification task, and its performance was compared with that of 6 state-of-the-art algorithms on two measures: accuracy and reduction. The results show that, in general, the proposed approach provides a good trade-off between accuracy and reduction, with a significantly lower running time than the other approaches.
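
The two steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how the spatial partitions are built, how density is measured, or how a prototype is extracted, so the sketch assumes an equal-width grid over each feature, density measured as the number of instances in a cell, partitioning done separately per class, and the centroid of a cell used as its prototype. The names `select_prototypes`, `n_intervals`, and `keep_fraction` are hypothetical.

```python
from collections import defaultdict
from math import ceil, floor


def select_prototypes(X, y, n_intervals=4, keep_fraction=0.5):
    """Return (prototypes, labels) taken from the densest grid cells of each class.

    Assumptions (not from the paper's abstract): equal-width grid cells,
    density = instance count, prototype = centroid of the cell's members.
    """
    n_dims = len(X[0])
    lows = [min(row[d] for row in X) for d in range(n_dims)]
    highs = [max(row[d] for row in X) for d in range(n_dims)]

    def cell_of(row):
        # Map an instance to the tuple of interval indices that identifies its cell.
        index = []
        for d in range(n_dims):
            span = highs[d] - lows[d]
            if span == 0:
                index.append(0)
            else:
                i = floor((row[d] - lows[d]) / span * n_intervals)
                index.append(min(i, n_intervals - 1))  # clamp the maximum value
        return tuple(index)

    # Step 1: a single pass over the data groups instances by (class, cell).
    cells = defaultdict(lambda: defaultdict(list))
    for row, label in zip(X, y):
        cells[label][cell_of(row)].append(row)

    prototypes, labels = [], []
    for label, by_cell in cells.items():
        # Step 2: rank the cells of this class by density and keep the densest ones.
        ranked = sorted(by_cell.values(), key=len, reverse=True)
        k = max(1, ceil(keep_fraction * len(ranked)))
        for members in ranked[:k]:
            centroid = [sum(col) / len(members) for col in zip(*members)]
            prototypes.append(centroid)
            labels.append(label)
    return prototypes, labels
```

Under these assumptions, assigning instances to cells takes one pass over the data and needs no pairwise distance computations, which is one plausible way a partition-based method can keep its running time low.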

Full Text