Abstract

Streaming feature selection (SFS) is the task of selecting the most informative features in high-dimensional or incrementally growing problems. Several SFS algorithms have been proposed in the literature. However, because of computational cost, they do not consider all feature subsets at the redundancy analysis step. Moreover, they do not reconsider previously removed features, which leads to the loss of most of the useful information. In this paper, the redundancy analysis step is defined as a binary optimization problem, and a binary bat algorithm (BBA) is adopted to find the minimal informative subsets. In this way, a large number of feature subsets can be considered efficiently at the redundancy analysis step. In addition, a priority list is used to retain previously removed redundant features, which allows informative features to be re-examined. As a result, it is possible to consider the mutual information between features that do not arrive within a short time interval of each other. Experimental studies on fifteen datasets of different types show that our approach is superior to state-of-the-art online and offline streaming feature selection methods in terms of classification accuracy.
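
To make the idea of redundancy analysis as a binary optimization problem more concrete, the following is a minimal sketch of a binary bat-style search over feature masks. It is not the authors' implementation: the function name `binary_bat_search`, the parameter values, the simplified local step, and the toy fitness (hypothetical relevance scores, one hypothetical redundant pair, and a small per-feature cost) are all assumptions chosen for illustration. Only the general scheme, real-valued velocities mapped to bit flips through a V-shaped transfer function, follows the commonly described binary bat algorithm.

```python
import math
import random


def binary_bat_search(fitness, n_features, n_bats=10, n_iters=50, seed=0):
    """Simplified binary bat-style search; returns (best_mask, best_fitness)."""
    rng = random.Random(seed)
    # Random initial binary positions (1 = keep the feature) and zero velocities.
    pos = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(n_bats)]
    vel = [[0.0] * n_features for _ in range(n_bats)]
    fit = [fitness(p) for p in pos]
    loud = [1.0] * n_bats  # loudness A_i (assumed initial value)
    rate = [0.5] * n_bats  # pulse emission rate r_i (assumed initial value)
    b = max(range(n_bats), key=fit.__getitem__)
    best, best_fit = pos[b][:], fit[b]

    for t in range(1, n_iters + 1):
        for i in range(n_bats):
            freq = rng.random() * 2.0  # frequency drawn from [0, 2)
            cand = pos[i][:]
            for d in range(n_features):
                vel[i][d] += (pos[i][d] - best[d]) * freq
                # V-shaped transfer function: velocity -> probability of flipping the bit.
                flip_prob = abs((2 / math.pi) * math.atan((math.pi / 2) * vel[i][d]))
                if rng.random() < flip_prob:
                    cand[d] = 1 - cand[d]
            # Simplified local step (an assumption): copy one random bit from the best mask.
            if rng.random() > rate[i]:
                d = rng.randrange(n_features)
                cand[d] = best[d]
            cand_fit = fitness(cand)
            # Accept a non-worse candidate with probability equal to the loudness.
            if cand_fit >= fit[i] and rng.random() < loud[i]:
                pos[i], fit[i] = cand, cand_fit
                loud[i] *= 0.9
                rate[i] = 0.5 * (1 - math.exp(-0.9 * t))
            if fit[i] > best_fit:
                best, best_fit = pos[i][:], fit[i]
    return best, best_fit


# Toy fitness with hypothetical numbers: reward relevance, penalize a
# redundant pair and the subset size so that smaller subsets are preferred.
relevance = [0.9, 0.8, 0.1, 0.7]
redundant_pairs = {(0, 1): 0.75}


def toy_fitness(mask):
    gain = sum(r for r, m in zip(relevance, mask) if m)
    penalty = sum(w for (i, j), w in redundant_pairs.items() if mask[i] and mask[j])
    return gain - penalty - 0.15 * sum(mask)


mask, score = binary_bat_search(toy_fitness, n_features=4, seed=1)
print(mask, round(score, 3))
```

In an actual SFS setting the fitness would instead be built from the relevance and redundancy measure the method uses (the abstract mentions mutual information), and the search would be rerun as new features arrive; the sketch only shows how a population-based binary search can score many candidate subsets without enumerating all of them.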
