On the combination of evolutionary algorithms and stratified strategies for training set selection in data mining

José Ramón Cano, Manuel Lozano, Francisco Herrera

doi:10.1016/j.asoc.2005.02.006

Abstract

In this paper, we present a new approach for training set selection in large data sets. The algorithm combines stratification with evolutionary algorithms. Stratification reduces the size of the domain to which selection is applied, while the evolutionary method selects the most representative instances. The performance of the proposal is compared with that of seven non-evolutionary algorithms, also executed in stratified form. The analysis follows two evaluation approaches: the balance between reduction and accuracy of the selected subsets, and the balance between interpretability and accuracy of the representation models associated with these subsets. The algorithms have been assessed on large and huge data sets. The study shows that stratified evolutionary instance selection consistently outperforms the non-evolutionary algorithms. Its main advantages are high instance reduction rates, high classification accuracy, and models with high interpretability.
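
The general idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the evolutionary algorithm, the classifier, or the fitness function, so the simple generational genetic algorithm, the 1-NN evaluation, the weighted fitness with `alpha`, and every function and parameter name below are assumptions chosen only for illustration.

```python
import random


def nn_accuracy(subset, data):
    """1-NN accuracy on `data` using `subset` as the reference set.

    Simplification: an instance kept in `subset` is its own nearest neighbour.
    """
    if not subset:
        return 0.0
    hits = 0
    for x, y in data:
        # Nearest selected instance by squared Euclidean distance.
        _, pred = min(subset, key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], x)))
        hits += pred == y
    return hits / len(data)


def fitness(mask, stratum, alpha=0.5):
    """Assumed fitness: weighted balance of accuracy and reduction rate."""
    subset = [inst for inst, keep in zip(stratum, mask) if keep]
    reduction = 1.0 - len(subset) / len(stratum)
    return alpha * nn_accuracy(subset, stratum) + (1 - alpha) * reduction


def evolve_mask(stratum, rng, pop_size=10, generations=20, p_mut=0.05):
    """Evolve a boolean keep/drop mask over one stratum (generic GA, assumed)."""
    n = len(stratum)
    pop = [[rng.random() < 0.5 for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(m, stratum), reverse=True)
        parents = pop[: pop_size // 2]  # elitist truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            child = [(not g) if rng.random() < p_mut else g for g in child]  # bit-flip mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda m: fitness(m, stratum))


def stratified_selection(data, n_strata, seed=0):
    """Split `data` into disjoint strata, select within each, merge the results."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    strata = [shuffled[i::n_strata] for i in range(n_strata)]
    selected = []
    for stratum in strata:
        if not stratum:  # skip empty strata when n_strata > len(data)
            continue
        mask = evolve_mask(stratum, rng)
        selected.extend(inst for inst, keep in zip(stratum, mask) if keep)
    return selected


# Toy usage: two well-separated 2-D classes, 30 instances each (synthetic data).
demo_rng = random.Random(1)
data = [((demo_rng.gauss(c, 0.5), demo_rng.gauss(c, 0.5)), c) for c in (0, 3) for _ in range(30)]
subset = stratified_selection(data, n_strata=3)
```

Because each stratum is processed independently, the search space of every evolutionary run is limited to the size of one stratum rather than the whole data set, which is the role the abstract assigns to stratification.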

Full Text