Abstract

Feature selection is a data preprocessing strategy that selects an optimal feature subset from high-dimensional data for diverse pattern recognition problems. Its aims are to enhance accuracy, improve evaluation performance, and find the smallest effective feature subset. In this study, an ensemble feature selection method is adopted, based on the assumption that a combination of several feature selection methods yields more robust results than any individual method. Ensemble feature selection requires a combinational method that merges the feature rankings produced by diverse algorithms into a single rank for each feature. It also requires a threshold to obtain a functional subset of features. In this work, a three-step ensemble feature selection technique called Automatic Thresholding Feature Selection (ATFS) is proposed. The first step is diversity generation, in which multiple rankers are applied to each dataset to generate different feature rankings. In the second step, the output rankings of the individual selectors are combined using fast non-dominated sorting, a combinational method that gives the proposed ensemble its automatic thresholding capability. In the third step, feature sets are generated to obtain the optimal feature set. Additionally, a new filter method called Sorted Label Interference (SLI) is proposed, based on interference between class labels. Both SLI and ATFS are applicable to binary datasets. The performance of SLI and ATFS is at least comparable to, and often better than, that of individual rankers and existing ensemble methods. The results also show that using the ATFS-generated threshold improves not only the performance of ATFS and SLI, but also that of other filters and combinational methods.
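
To illustrate the second step, the following is a minimal sketch of how non-dominated sorting can combine rankings from several rankers. It is not the authors' implementation: the function names, the example ranks, and the convention that a lower rank is better are assumptions made for illustration. The abstract does not specify how the threshold is derived from the fronts, so the sketch only computes the fronts; taking the first front as the selected subset is one possible reading, not a confirmed detail of ATFS.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if rank vector a is no worse than b under every ranker and
    strictly better under at least one (assumption: lower rank = better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated_fronts(ranks: List[Sequence[float]]) -> List[List[int]]:
    """Group feature indices into successive Pareto fronts.

    ranks[i] holds the rank assigned to feature i by each ranker.
    """
    n = len(ranks)
    dominated_by = [[] for _ in range(n)]  # features that feature i dominates
    counts = [0] * n                       # how many features dominate feature i
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if dominates(ranks[i], ranks[j]):
                dominated_by[i].append(j)
            elif dominates(ranks[j], ranks[i]):
                counts[i] += 1

    fronts: List[List[int]] = []
    current = [i for i in range(n) if counts[i] == 0]
    while current:
        fronts.append(current)
        nxt = []
        for i in current:
            for j in dominated_by[i]:
                counts[j] -= 1
                if counts[j] == 0:  # all dominators of j are already placed
                    nxt.append(j)
        current = sorted(nxt)
    return fronts


# Hypothetical ranks of five features under three rankers.
example_ranks = [(1, 2, 1), (2, 1, 3), (3, 3, 2), (4, 5, 4), (5, 4, 5)]
fronts = non_dominated_fronts(example_ranks)
```

In this example, features 0 and 1 are not dominated by any other feature and form the first front. The number of features in each front is determined by the rankings themselves rather than by a user-supplied cutoff, which is the sense in which such a scheme can supply a threshold automatically.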
