Efficient model selection for regularized linear discriminant analysis

Jieping Ye,Ravi Janardan,Qi Li,Jinbo Bi,Chandra Kambhamettu,Vladimir Cherkassky,Tao Xiong

doi:10.1145/1183614.1183691

Abstract

Classical Linear Discriminant Analysis (LDA) is not applicable for small sample size problems due to the singularity of the scatter matrices involved. Regularized LDA (RLDA) provides a simple strategy to overcome the singularity problem by applying a regularization term, which is commonly estimated via cross-validation from a set of candidates. However, cross-validation may be computationally prohibitive when the candidate set is large. An efficient algorithm for RLDA is presented that computes the optimal transformation of RLDA for a large set of parameter candidates, with approximately the same cost as running RLDA a small number of times. Thus it facilitates efficient model selection for RLDA.An intrinsic relationship between RLDA and Uncorrelated LDA (ULDA), which was recently proposed for dimension reduction and classification is presented. More specifically, RLDA is shown to approach ULDA when the regularization value tends to zero. That is, RLDA without any regularization is equivalent to ULDA. It can be further shown that ULDA maps all data points from the same class to a common point, under a mild condition which has been shown to hold for many high-dimensional datasets. This leads to the overfitting problem in ULDA, which has been observed in several applications. Thetheoretical analysis presented provides further justification for the use of regularization in RLDA. Extensive experiments confirm the claimed theoretical estimate of efficiency. Experiments also show that, for a properly chosen regularization parameter, RLDA performs favorably in classification, in comparison with ULDA, as well as other existing LDA-based algorithms and Support Vector Machines (SVM).

Full Text