Feature reduction based on rough set theory is an effective feature selection method in pattern recognition applications. Finding a minimal subset of the original features is inherent in rough set approach to feature selection. As feature reduction is a Nondeterministic Polynomial‐time‐hard problem, it is necessary to develop fast optimal or near‐optimal feature selection algorithms. This article aims to propose an exact feature selection algorithm in rough set that is efficient in terms of computation time. The proposed algorithm begins the examination of a solution tree by a breadth‐first strategy. The pruned nodes are held in a version of the trie data structure. Based on the monotonic property of dependency degree, all subsets of the pruned nodes cannot be optimal solutions. Thus, by detecting these subsets in trie, it is not necessary to calculate their dependency degree. The search on the tree continues until the optimal solution is found. This algorithm is improved by selecting an initial search level determined by the hill‐climbing method instead of searching the tree from the level below the root. The length of the minimal reduct and the size of data set can influence which starting search level is more efficient. The experimental results using some of the standard UCI data sets, demonstrate that the proposed algorithm is effective and efficient for data sets with more than 30 features. © 2014 Wiley Periodicals, Inc. Complexity 20: 50–62, 2015
Read full abstract