Abstract
Linear discriminant analysis (LDA) is one of the most popular approaches to feature extraction and dimension reduction for overcoming the curse of dimensionality in the high-dimensional data that arise in many applications of data mining, machine learning, and bioinformatics. In this paper, we make two main contributions to an important LDA scheme, the generalized Foley–Sammon transform (GFST) [Foley and Sammon, IEEE Trans. Comput., 24 (1975), pp. 281–289; Guo et al., Pattern Recognition Lett., 24 (2003), pp. 147–158], also known as the trace ratio model [Wang et al., Proceedings of the International Conference on Computer Vision and Pattern Recognition, 2007, pp. 1–8], and to its regularized version (RGFST), which handles the undersampled problem, in which the sample size $n$ is small but the number of features $N$ is large ($N>n$), a situation that arises frequently in modern applications. Our first main result establishes an equivalent reduced model for the RGFST that effectively reduces the computational overhead. The iteration method proposed by Wang et al. is applied to solve the GFST or the reduced RGFST. Wang et al. proved that this iteration converges globally, and fast convergence has been observed numerically, but there has been no theoretical analysis of the convergence rate thus far. Our second main contribution supplies this important missing piece by proving quadratic convergence, even under two kinds of inexact computations. Practical implementation issues, including computational complexity and storage requirements, are also discussed. Our experimental results on several real-world data sets indicate the efficiency of the algorithm and the advantages of the GFST model in classification.
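For orientation, the trace ratio iteration mentioned above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the problem takes the common form of maximizing $\mathrm{tr}(V^{T}AV)/\mathrm{tr}(V^{T}BV)$ over $N\times p$ matrices $V$ with $V^{T}V=I$, where `A` and `B` are hypothetical stand-ins for symmetric scatter-type matrices and `B` is positive definite. The function name `trace_ratio` and the parameters `p`, `tol`, and `max_iter` are chosen here for illustration only, and the eigenproblem is solved exactly with a dense solver rather than inexactly.

```python
import numpy as np

def trace_ratio(A, B, p, tol=1e-10, max_iter=100):
    """Sketch of a trace ratio iteration (assumed form, see lead-in).

    A, B : symmetric N x N arrays, B assumed positive definite.
    p    : number of columns of V (target dimension).
    Returns an N x p matrix V with orthonormal columns and the ratio rho.
    """
    N = A.shape[0]
    V = np.eye(N)[:, :p]  # any orthonormal starting point works for the sketch
    rho = np.trace(V.T @ A @ V) / np.trace(V.T @ B @ V)
    for _ in range(max_iter):
        # eigh returns eigenvalues in ascending order, so the last p
        # columns span the eigenspace of the p largest eigenvalues
        # of A - rho * B.
        _, U = np.linalg.eigh(A - rho * B)
        V = U[:, -p:]
        rho_new = np.trace(V.T @ A @ V) / np.trace(V.T @ B @ V)
        converged = abs(rho_new - rho) <= tol * max(1.0, abs(rho_new))
        rho = rho_new
        if converged:
            break
    return V, rho
```

At a solution, the sum of the `p` largest eigenvalues of `A - rho * B` is zero, which gives a simple check on the returned `rho`. A dense `eigh` call costs on the order of $N^3$ operations per step, which is one reason a reduced model matters when $N$ is much larger than $n$.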