Abstract
In this paper we propose three novel feature ranking methods, based on possibility theory, for supervised feature selection in classification. All three methods (nonspecificity, strife and total uncertainty) are tested on eight artificial data sets and ten real-world medical data sets. They are benchmarked against ReliefF, the Fisher score, the fuzzy entropy and similarity (FES) and fuzzy similarity and entropy (FSAE) filters, symmetrical uncertainty, and a baseline that uses no feature selection. The feature ranking methods were applied following two approaches: (1) using a fixed threshold for the number of highest-ranking features selected and (2) using a hybrid feature selection approach in which a classifier (k-nearest neighbor classifier, decision tree, similarity classifier or SVM) determines the optimal number of features to select. The results indicate that, for both approaches, strife and the Fisher score are on average the two highest-ranked feature ranking methods in terms of test set accuracy on the real-world data sets. Moreover, in the hybrid approach, strife in most cases uses considerably fewer features than nonspecificity and total uncertainty. In terms of stability, measured with the adjusted stability measure (ASM), the Fisher score and strife were among the most stable feature ranking methods in this study. Additionally, strife's feature subsets were diverse compared to those of the remaining feature selection methods, which makes strife a good candidate for inclusion in a feature selection ensemble.
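The two ways of applying a feature ranking described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the toy data, the hypothetical relevance scores and the leave-one-out 1-nearest-neighbor evaluation are all assumptions made for illustration, and the scores stand in for any of the ranking methods (for example strife or the Fisher score).

```python
from typing import Callable, List, Sequence


def rank_features(scores: Sequence[float]) -> List[int]:
    """Return feature indices ordered from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


def select_top_k(scores: Sequence[float], k: int) -> List[int]:
    """Approach (1): keep a fixed number k of the highest-ranking features."""
    return rank_features(scores)[:k]


def select_hybrid(scores: Sequence[float],
                  evaluate: Callable[[List[int]], float]) -> List[int]:
    """Approach (2): let a classifier choose how many top-ranked features to keep.

    `evaluate` maps a feature subset to an accuracy estimate. Ties are
    resolved in favor of the smaller subset (an assumption of this sketch).
    """
    ranking = rank_features(scores)
    best_subset: List[int] = []
    best_acc = float("-inf")
    for k in range(1, len(ranking) + 1):
        subset = ranking[:k]
        acc = evaluate(subset)
        if acc > best_acc:  # strict comparison keeps the smaller subset on ties
            best_subset, best_acc = subset, acc
    return best_subset


def loo_1nn_accuracy(X: Sequence[Sequence[float]], y: Sequence[int],
                     subset: Sequence[int]) -> float:
    """Leave-one-out accuracy of a 1-nearest-neighbor classifier on `subset`."""
    correct = 0
    for i in range(len(X)):
        best_j, best_d = -1, float("inf")
        for j in range(len(X)):
            if j == i:
                continue
            # Squared Euclidean distance restricted to the selected features.
            d = sum((X[i][f] - X[j][f]) ** 2 for f in subset)
            if d < best_d:
                best_j, best_d = j, d
        correct += int(y[best_j] == y[i])
    return correct / len(X)


# Toy data (hypothetical): feature 0 separates the classes, features 1 and 2 do not.
X_toy = [
    [0.0, 5.0, 1.0],
    [0.1, 1.0, 9.0],
    [0.2, 8.0, 4.0],
    [1.0, 5.1, 1.1],
    [1.1, 1.1, 9.1],
    [1.2, 8.1, 4.1],
]
y_toy = [0, 0, 0, 1, 1, 1]
toy_scores = [0.9, 0.1, 0.4]  # hypothetical relevance scores, one per feature

fixed_subset = select_top_k(toy_scores, k=2)
hybrid_subset = select_hybrid(
    toy_scores, lambda s: loo_1nn_accuracy(X_toy, y_toy, s)
)
```

In this toy setting the fixed threshold keeps a non-informative feature simply because k was set to 2, whereas the hybrid search stops at the subset on which the classifier performs best. In practice the accuracy estimate used for choosing the subset size would be computed on training or validation data only, so that the test set stays untouched.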
More From: Engineering Applications of Artificial Intelligence