Abstract

High-dimensional data and the small-sample-size problem arise in many modern pattern classification applications, such as face recognition and gene expression data analysis. An important step in handling such data is dimensionality reduction. Principal component analysis (PCA) and between-group analysis (BGA) are two commonly used methods, and various extensions of both exist. Both approaches are justified by their best-approximation properties. From a pattern recognition perspective, we show that PCA, which is based on the total scatter matrix, preserves linear separability, while BGA, which is based on the between-class scatter matrix, retains the distances between class centroids. Moreover, we propose an automatic, nonparametric uncorrelated discriminant analysis (UDA) algorithm based on the maximum margin criterion (MMC). The features extracted by UDA are statistically uncorrelated. UDA combines rank-preserving dimensionality reduction with constrained discriminant analysis and also serves as an effective solution to the small-sample-size problem. Experiments on face-image and gene-expression data sets evaluate UDA in terms of classification accuracy and robustness.
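For orientation, a minimal sketch of the scatter matrices and criterion the abstract refers to, using the standard definitions (the paper's exact formulation and constraints may differ): given samples $x_i$ with overall mean $\mu$, class means $\mu_c$, and class sizes $n_c$,

\[
S_t = \sum_{i=1}^{n} (x_i - \mu)(x_i - \mu)^{\top}, \qquad
S_b = \sum_{c=1}^{C} n_c\, (\mu_c - \mu)(\mu_c - \mu)^{\top}, \qquad
S_w = S_t - S_b .
\]

The maximum margin criterion seeks a projection $W$ maximizing $\operatorname{tr}\!\big(W^{\top}(S_b - S_w)\,W\big)$; imposing an uncorrelatedness constraint of the form $W^{\top} S_t W = I$ is one standard way to obtain statistically uncorrelated features, as in the UDA setting described above.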
