Abstract

This paper addresses feature subset selection in classification datasets with a very large number of attributes. In such datasets, classical wrapper approaches become intractable because of the large number of wrapper evaluations they require. One way to alleviate this problem is the so-called filter-wrapper approach, or Incremental Wrapper-based Subset Selection (IWSS), which first ranks the predictive attributes using a filter measure and then applies a wrapper approach that follows the ranking. In this way, the number of wrapper evaluations is linear in the number of predictive attributes. We present two contributions to the IWSS approach. The first aims at obtaining more compact subsets: it allows not only the addition of new attributes but also their interchange with attributes already included in the selected subset. The second, termed early stopping, sets an adaptive threshold on the number of attributes in the ranking to be considered. The advantages of these new approaches are analyzed both theoretically and experimentally. Results on a set of 12 high-dimensional datasets corroborate the success of our proposals.
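The incremental scheme described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `iwss`, the `allow_swap` flag, the tie-breaking rule (prefer the smaller subset), and the toy scoring function are all assumptions made for illustration, and the abstract does not specify the exact acceptance criterion or the early-stopping rule, so the latter is left out. The `evaluate` callback stands in for a wrapper evaluation such as cross-validated accuracy.

```python
from typing import Callable, List, Sequence, Tuple


def iwss(
    ranking: Sequence[int],
    evaluate: Callable[[List[int]], float],
    allow_swap: bool = False,
    min_gain: float = 1e-9,
) -> Tuple[List[int], float]:
    """Walk a filter ranking once, keeping attributes that improve the score.

    With allow_swap=False each attribute costs one wrapper evaluation, so the
    total is linear in the number of attributes. With allow_swap=True
    (a hypothetical rendering of the interchange idea) each attribute is also
    tried in place of every attribute already selected.
    """
    selected: List[int] = []
    best = evaluate(selected)
    for attr in ranking:
        # Candidate subsets: add attr, or (optionally) swap it for a selected one.
        candidates = [selected + [attr]]
        if allow_swap:
            for i in range(len(selected)):
                candidates.append(selected[:i] + [attr] + selected[i + 1:])
        scored = [(evaluate(c), c) for c in candidates]
        # Highest score wins; ties go to the smaller subset (assumed rule).
        top_score, top = max(scored, key=lambda sc: (sc[0], -len(sc[1])))
        if top_score > best + min_gain:
            selected, best = top, top_score
    return selected, best


def toy_score(subset: List[int]) -> float:
    """Stand-in for a wrapper evaluation; attribute 2 makes attribute 0 redundant."""
    s = set(subset)
    score = 0.5
    if 2 in s:
        score += 0.3
    elif 0 in s:
        score += 0.2
    if 5 in s:
        score += 0.1
    return score


ranking = [0, 2, 5, 1]  # assumed output of some filter measure, best first
plain_subset, plain_score = iwss(ranking, toy_score)
compact_subset, compact_score = iwss(ranking, toy_score, allow_swap=True)
print(plain_subset, compact_subset)
```

In this toy setting the swap variant reaches the same score with one attribute fewer, because the redundant attribute picked up early in the ranking is replaced rather than kept. The price is extra wrapper evaluations per step, which grows with the size of the selected subset.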
