Abstract
Feature definition in the design of a pattern recognition system has not received adequate attention for the case where few training samples are available but the number of potential features is very large. Most existing feature extraction and feature selection procedures become computationally infeasible when the number of features exceeds, say, 100, and are not even applicable when the number of features exceeds the number of patterns. The proposed feature definition procedure partitions a large set of highly correlated features into subsets, or clusters, through hierarchical clustering. Almost any feature selection or extraction procedure, including the constrained maximum variance approach introduced here, can then be applied to each subset to obtain a single representative feature. The original set of correlated features is thus reduced to a small set of nearly uncorrelated features. The utility of this procedure is demonstrated on a speaker-identification database consisting of 20 subjects, 156 features, and 180 samples.
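The two-stage idea described above (cluster the correlated features, then reduce each cluster to one representative) can be illustrated with a minimal sketch. Several choices here are assumptions rather than details taken from the abstract: the distance between features is taken as 1 - |correlation|, the clustering uses average linkage, and the per-cluster reduction is an ordinary first principal component, used as a stand-in and not the constrained maximum variance approach itself. The function names `cluster_features` and `representative_features` are hypothetical, and the data are synthetic, not the speaker-identification set.

```python
import numpy as np

def cluster_features(X, n_clusters):
    """Group the columns of X by average-linkage agglomerative clustering.

    Assumption: feature-to-feature distance is 1 - |Pearson correlation|.
    """
    corr = np.corrcoef(X, rowvar=False)
    dist = 1.0 - np.abs(corr)
    clusters = [[j] for j in range(X.shape[1])]  # start with singletons
    while len(clusters) > n_clusters:
        best = (np.inf, 0, 1)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # average linkage: mean distance over all cross pairs
                d = dist[np.ix_(clusters[a], clusters[b])].mean()
                if d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge the closest pair
        del clusters[b]
    return clusters

def representative_features(X, clusters):
    """Reduce each cluster to one feature.

    Stand-in technique: project the centered cluster onto its first
    principal component (the direction of maximum variance).
    """
    Xc = X - X.mean(axis=0)
    reps = []
    for idx in clusters:
        block = Xc[:, idx]
        _, _, vt = np.linalg.svd(block, full_matrices=False)
        reps.append(block @ vt[0])  # scores along the first right singular vector
    return np.column_stack(reps)

# Synthetic example: 180 samples, 12 features forming 3 groups of 4
# highly correlated features (each group shares one latent variable).
rng = np.random.default_rng(0)
latent = rng.normal(size=(180, 3))
X = np.repeat(latent, 4, axis=1) + 0.1 * rng.normal(size=(180, 12))

clusters = cluster_features(X, n_clusters=3)
Z = representative_features(X, clusters)
print(sorted(sorted(c) for c in clusters))
print(Z.shape)
```

In this sketch the number of clusters is fixed in advance; cutting the hierarchy at a distance threshold would be an equally reasonable choice. Because each cluster is reduced separately, every decomposition involves only the features in that cluster, which is what keeps the computation manageable when the full feature set is large.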