Abstract

Feature selection is a pre-processing step for training classifiers in order to improve their performance. Rough Set Feature Selection (RSFS) is a novel feature selection approach. RSFS removes only the redundant attributes while keeping the important ones that preserve the classification power of the original dataset. The feature subsets selected by RSFS are called reducts, and the intersection of all reducts is called the core. This paper investigates the effect of RSFS on the performance of decision trees in terms of classification accuracy and number of tree nodes. Nine datasets from different domains are used. For all datasets, there exists at least one reduct that improves the performance of decision trees, and the minimal reduct is not the best-quality reduct for improving decision tree performance. The effect of RSFS on the performance of decision trees is shown to be related to the ratio of core size to dataset dimensionality. The core size is shown to be determined by the presence of pairs of core-determining objects within the dataset.
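To make the reduct and core concepts concrete, the following Python sketch enumerates attribute subsets of a small, made-up decision table (not one of the paper's nine datasets), keeps the minimal subsets that preserve the decision classification as reducts, and takes their intersection as the core. This is an illustrative brute-force enumeration under simplifying assumptions (a consistent table, discrete attributes), not the RSFS procedure evaluated in the paper.

```python
from itertools import combinations

# Hypothetical toy decision table: three condition attributes a1..a3
# and a binary decision d (values chosen only for illustration).
objects = [
    {"a1": 0, "a2": 0, "a3": 0, "d": 0},
    {"a1": 0, "a2": 1, "a3": 1, "d": 1},
    {"a1": 1, "a2": 0, "a3": 0, "d": 1},
    {"a1": 1, "a2": 1, "a3": 1, "d": 0},
]
conditions = ["a1", "a2", "a3"]
decision = "d"

def preserves_classification(attrs):
    """True if objects indiscernible on `attrs` never disagree on the
    decision, i.e. the subset keeps this consistent table consistent."""
    seen = {}
    for obj in objects:
        key = tuple(obj[a] for a in attrs)
        if key in seen and seen[key] != obj[decision]:
            return False
        seen[key] = obj[decision]
    return True

# A reduct is a minimal attribute subset that preserves the classification
# power of the full condition set; enumerating subsets by increasing size
# lets us skip any superset of an already-found reduct.
reducts = []
for size in range(1, len(conditions) + 1):
    for subset in combinations(conditions, size):
        if any(set(r) <= set(subset) for r in reducts):
            continue  # contains a smaller reduct, so not minimal
        if preserves_classification(subset):
            reducts.append(subset)

# The core is the intersection of all reducts.
core = set(conditions).intersection(*map(set, reducts)) if reducts else set()

print("reducts:", reducts)  # [('a1', 'a2'), ('a1', 'a3')]
print("core:", core)        # {'a1'}
```

In this toy table the attribute a1 belongs to every reduct, so it forms the core; dropping it would merge objects with different decisions, which mirrors the paper's point that the core is fixed by pairs of core-determining objects.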
