Abstract

The problem of feature definition in the design of a pattern recognition system where the number of available training samples is small but the number of potential features is excessively large has not received adequate attention. Most of the existing feature extraction and feature selection procedures are not feasible due to computational considerations when the number of features exceeds, say, 100, and are not even applicable when the number of features exceeds the number of patterns. The feature definition procedure which we have proposed involves partitioning a large set of highly correlated features into subsets, or clusters, through hierarchical clustering. Almost any feature selection or extraction procedure, including the constrained maximum variance approach introduced here, can then be applied to each subset to obtain a single representative feature. The original set of correlated features is thus reduced to a small set of nearly uncorrelated features. The utility of this procedure has been demonstrated on a speaker-identification data base which consists of 20 subjects, 156 features, and 180 samples.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call