A Novel Genetic Algorithm Approach to Simultaneous Feature Selection and Instance Selection

Inti Mateus Resende Albuquerque, Bach Hoai Nguyen, Bing Xue, Mengjie Zhang

doi:10.1109/ssci47803.2020.9308307

Abstract

With advancements in technology, the amount of collected data has increased sharply. Most existing learning algorithms do not perform well on such large datasets because they contain useless data, such as irrelevant or redundant features and noisy instances. Feature selection and instance selection are two common data reduction approaches that address this issue. However, most existing data reduction algorithms focus solely on either feature selection or instance selection. In this paper, we propose a novel data reduction algorithm, based on a Genetic Algorithm, that performs feature selection and instance selection simultaneously and considers the complementary interaction between the selected features and instances. The proposed algorithm is examined on nine real-world datasets of varying difficulty. The experimental results show that the proposed algorithm can successfully reduce both the number of features and the number of instances while maintaining or even improving the learning performance compared with using the original data.
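To make the idea of simultaneous selection concrete, the following is a minimal sketch of how a single Genetic Algorithm individual could encode both choices at once. It is an illustration under stated assumptions, not the representation or fitness function used in the paper: the concatenated bit-string encoding, the 1-nearest-neighbour evaluator, the weighting parameter `alpha`, and all function names are hypothetical.

```python
def split_mask(chromosome, n_features):
    """Split one bit string into a feature mask and an instance mask (assumed encoding)."""
    return chromosome[:n_features], chromosome[n_features:]


def reduce_data(X, y, chromosome):
    """Keep only the training rows and columns whose bits are set to 1."""
    feat_mask, inst_mask = split_mask(chromosome, len(X[0]))
    cols = [j for j, bit in enumerate(feat_mask) if bit]
    rows = [i for i, bit in enumerate(inst_mask) if bit]
    X_red = [[X[i][j] for j in cols] for i in rows]
    y_red = [y[i] for i in rows]
    return X_red, y_red, cols


def one_nn_predict(X_train, y_train, x):
    """Label of the closest training point under squared Euclidean distance."""
    nearest = min(
        range(len(X_train)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(X_train[i], x)),
    )
    return y_train[nearest]


def fitness(chromosome, X_train, y_train, X_val, y_val, alpha=0.9):
    """Assumed fitness: weighted sum of validation accuracy and reduction rate."""
    X_red, y_red, cols = reduce_data(X_train, y_train, chromosome)
    if not cols or not X_red:
        return 0.0  # an individual that selects nothing cannot classify
    correct = sum(
        one_nn_predict(X_red, y_red, [x[j] for j in cols]) == target
        for x, target in zip(X_val, y_val)
    )
    accuracy = correct / len(X_val)
    kept = sum(chromosome) / len(chromosome)  # fraction of bits still selected
    return alpha * accuracy + (1 - alpha) * (1 - kept)


# Toy data: feature 0 separates the classes, feature 1 is noise.
X_train = [[0.0, 5.0], [0.1, -3.0], [1.0, 4.0], [0.9, -2.0]]
y_train = [0, 0, 1, 1]
X_val = [[0.05, 0.0], [0.95, 0.0]]
y_val = [0, 1]

full = [1, 1, 1, 1, 1, 1]     # keep both features and all four instances
reduced = [1, 0, 1, 0, 1, 0]  # keep feature 0 and instances 0 and 2
print(fitness(full, X_train, y_train, X_val, y_val))
print(fitness(reduced, X_train, y_train, X_val, y_val))
```

Because both masks are evaluated together, the fitness of a feature subset depends on which instances are kept and vice versa, which is one way to picture the complementary interaction the abstract refers to. In this toy case the reduced individual scores higher than the unreduced one because dropping the noisy feature fixes a misclassification.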

Full Text