Abstract

Feature selection, one of the main methods for dimensionality reduction, chooses a useful subset of features from a set of candidates. Because most feature selection methods operate directly on the given data set, they are time-consuming on large-scale data sets. To address this issue, this paper proposes a novel feature selection method for large-scale data sets, called Q-learning with Fisher score (QLFS). QLFS adopts the Q-learning framework from reinforcement learning (RL) and uses the Fisher score (FS), a filter method for feature selection, as the internal reward. Here, FS is modified to compute the ratio of between-class to within-class distance for a feature subset rather than for a single feature. By sampling only part of the training set in each episode, QLFS performs batch learning and can therefore process large-scale data sets batch by batch. Experimental results on several large-scale UCI data sets show that QLFS not only improves classification performance but also trains faster than the compared methods.
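To make the idea concrete, the following is a minimal Python sketch of the two ingredients named above: a Fisher score computed on a feature subset, and a tabular Q-learning loop that sees only a random batch of samples per episode. The abstract does not specify the exact formulas, so everything here is an assumption for illustration: the scatter-based form of the subset score, the state (index of the feature under consideration), the actions (skip or select), the reward (change in subset score), and the names `subset_fisher_score` and `qlfs_sketch` with all their parameters. It should not be read as the authors' implementation.

```python
import numpy as np


def subset_fisher_score(X, y, subset):
    """Ratio of between-class to within-class scatter on the chosen columns.

    Assumed form: squared Euclidean distances in the subspace of `subset`.
    """
    if len(subset) == 0:
        return 0.0
    Xs = X[:, list(subset)]
    overall_mean = Xs.mean(axis=0)
    between, within = 0.0, 0.0
    for label in np.unique(y):
        Xc = Xs[y == label]
        class_mean = Xc.mean(axis=0)
        between += len(Xc) * np.sum((class_mean - overall_mean) ** 2)
        within += np.sum((Xc - class_mean) ** 2)
    return between / (within + 1e-12)  # small constant avoids division by zero


def qlfs_sketch(X, y, episodes=200, batch_size=64, alpha=0.1, gamma=0.9,
                epsilon=0.2, seed=0):
    """Toy tabular Q-learning (hypothetical design, not the paper's).

    State: index of the feature being considered.
    Action: 0 = skip the feature, 1 = add it to the subset.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    Q = np.zeros((n_features, 2))
    for _ in range(episodes):
        # Each episode uses only a random batch of the training samples.
        idx = rng.choice(n_samples, size=min(batch_size, n_samples), replace=False)
        Xb, yb = X[idx], y[idx]
        subset, score = [], 0.0
        for j in range(n_features):
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                action = int(rng.integers(2))
            else:
                action = int(np.argmax(Q[j]))
            if action == 1:
                subset.append(j)
            new_score = subset_fisher_score(Xb, yb, subset)
            reward = new_score - score  # assumed reward: change in subset score
            score = new_score
            future = 0.0 if j == n_features - 1 else Q[j + 1].max()
            # Standard Q-learning update.
            Q[j, action] += alpha * (reward + gamma * future - Q[j, action])
    # Keep the features whose learned value favors selection.
    selected = [j for j in range(n_features) if Q[j, 1] > Q[j, 0]]
    return Q, selected
```

Calling `qlfs_sketch(X, y)` with a NumPy array `X` of shape `(n_samples, n_features)` and a label vector `y` returns the learned Q-table and the indices of the features it favors. Because every episode scores only `batch_size` samples, the per-episode cost in this sketch does not grow with the size of the full data set.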

