Abstract

This chapter discusses techniques for selecting a subset of features from a larger pool of available features. The techniques include outlier removal, data normalization, hypothesis testing, the receiver operating characteristic (ROC) curve, and Fisher's discriminant ratio, among others. The goal is to select those features that are rich in discriminatory information with respect to the classification problem at hand. This is a crucial step in the design of any classification system, as a poor choice of features can severely degrade classifier performance. Selecting highly informative features is an attempt to place the classes far apart from each other in the feature space (large between-class distance) and to position the data points within each class close to each other (small within-class variance). Another major issue in feature selection is choosing the number of features l to be used out of an original m > l. Reducing this number is in line with our goal of avoiding overfitting to the specific training data set and of designing classifiers with good generalization performance, that is, classifiers that perform well when faced with data outside the training set. The choice of l depends heavily on the number of available training patterns, N. Before feature selection techniques can be applied, a preprocessing stage is necessary for “housekeeping” purposes, such as the removal of outlier points and data normalization.
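As a rough illustration of two of the steps mentioned above, the sketch below (not taken from the chapter; it is a minimal NumPy-only example with illustrative names X, y, and l, and toy data) z-score normalizes the training features and then ranks them by Fisher's discriminant ratio, FDR = (μ1 − μ2)² / (σ1² + σ2²), for a two-class problem, keeping the l highest-scoring features.

```python
import numpy as np

def zscore_normalize(X):
    """Normalize each feature (column) to zero mean and unit variance."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0                      # guard against constant features
    return (X - mean) / std

def fisher_discriminant_ratio(X, y):
    """Per-feature FDR = (m1 - m2)^2 / (s1^2 + s2^2) for two classes (labels 0 and 1)."""
    X1, X2 = X[y == 0], X[y == 1]
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    v1, v2 = X1.var(axis=0), X2.var(axis=0)
    return (m1 - m2) ** 2 / (v1 + v2 + 1e-12)

# Toy data: N = 100 patterns, m = 5 features, two classes with shifted means.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (50, 5)), rng.normal(1.0, 1.0, (50, 5))])
X[:, 2] = rng.normal(0.0, 1.0, 100)          # feature 2 carries no class information
y = np.array([0] * 50 + [1] * 50)

Xn = zscore_normalize(X)
fdr = fisher_discriminant_ratio(Xn, y)
l = 3
selected = np.argsort(fdr)[::-1][:l]         # keep the l highest-scoring features
print("FDR per feature:", np.round(fdr, 3))
print("Selected feature indices:", selected)
```

In this toy run the uninformative third feature receives a much lower FDR score than the others and is not among the l selected features, mirroring the goal of retaining only features with large between-class distance relative to their within-class variance.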
