Abstract

Over the last century, Component Analysis (CA) methods such as Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), Canonical Correlation Analysis (CCA), Locality Preserving Projections (LPP), and Spectral Clustering (SC) have been extensively used as a feature extraction step for modeling, classification, visualization, and clustering. CA techniques are appealing because many can be formulated as eigen-problems, offering great potential for learning linear and nonlinear representations of data in closed form. However, the eigen-formulation often conceals important analytic and computational drawbacks of CA techniques, such as solving generalized eigen-problems with rank-deficient matrices (e.g., the small sample size problem), the lack of an intuitive interpretation of normalization factors, and difficulty in understanding commonalities and differences between CA methods. This paper proposes a unified least-squares framework to formulate many CA methods. We show how PCA, LDA, CCA, LPP, SC, and their kernel and regularized extensions correspond to a particular instance of least-squares weighted kernel reduced rank regression (LS-WKRRR). The LS-WKRRR formulation of CA methods has several benefits: 1) it provides a clean connection between many CA techniques and an intuitive framework for understanding normalization factors; 2) it yields efficient numerical schemes to solve CA techniques; 3) it overcomes the small sample size problem; 4) it provides a framework to easily extend CA methods. We derive weighted generalizations of PCA, LDA, SC, and CCA, as well as several new CA techniques.
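
To make the least-squares view concrete, the simplest instance (PCA) can be recovered by alternating least squares on the rank-k factorization objective ||D − BC||²_F, rather than by an eigen-decomposition. The sketch below is illustrative only (variable names and sizes are our own, not the paper's algorithm); it checks that the subspace found by alternating least squares matches the top eigenvectors of the data covariance:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic centered data: 5-dimensional samples, most variance in 2 directions
D = rng.normal(size=(5, 100)) * np.array([5.0, 3.0, 0.5, 0.4, 0.3])[:, None]
D = D - D.mean(axis=1, keepdims=True)

k = 2                                # number of components
B = rng.normal(size=(5, k))          # basis (loadings), random initialization

# Alternating least squares on E(B, C) = ||D - B C||_F^2
for _ in range(200):
    C = np.linalg.lstsq(B, D, rcond=None)[0]        # coefficients given basis
    B = np.linalg.lstsq(C.T, D.T, rcond=None)[0].T  # basis given coefficients

# Compare the spanned subspace with the top-k eigenvectors of D D^T (PCA)
Q, _ = np.linalg.qr(B)
evals, evecs = np.linalg.eigh(D @ D.T)
V = evecs[:, ::-1][:, :k]            # eigenvectors of the 2 largest eigenvalues
# Singular values of Q^T V are cosines of the principal angles; ~1 means
# the two subspaces coincide
cos_angles = np.linalg.svd(Q.T @ V, compute_uv=False)
print(np.allclose(cos_angles, 1.0, atol=1e-6))
```

The least-squares route avoids forming and decomposing a possibly rank-deficient covariance matrix, which is one of the computational benefits the abstract refers to.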

Highlights

  • Over the last century, Component Analysis (CA) methods [1] such as Principal Component Analysis (PCA) [2], [3], Linear Discriminant Analysis (LDA) [4], [5], Canonical Correlation Analysis (CCA) [6], Laplacian Eigenmaps (LE) [7], Locality Preserving Projections (LPP) [8], and Spectral Clustering (SC) [9] have been extensively used as a feature extraction step for modeling, classification, visualization and clustering problems

  • This paper shows how Kernel PCA (KPCA), Kernel LDA (KLDA), Kernel CCA (KCCA), Normalized Cuts (Ncuts), and LE correspond to a particular instance of a least-squares weighted kernel reduced rank regression (LS-WKRRR) problem

  • This paper relates the fundamental equation of CA, Eq. (1), to other CA methods such as Nonnegative Matrix Factorization (NMF), Probabilistic PCA (PPCA), and Regularized LDA (RLDA), and proposes new extensions of CA methods such as Dynamic Coupled Component Analysis (DCCA), Aligned Cluster Analysis (ACA), Canonical Time Warping (CTW), Filtered Component Analysis (FCA), Parameterized Kernel Principal Component Analysis (PaKPCA), and Feature Selection for Subspace Analysis (FSSA)
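
Schematically (this is the standard reduced rank regression objective, written in illustrative notation rather than the paper's exact Eq. (1)), the least-squares problem that the highlights refer to takes the form

```latex
% Unweighted, linear reduced rank regression: the least-squares objective
% that LS-WKRRR generalizes. Notation here is illustrative.
\min_{\mathbf{A},\,\mathbf{B}} \;
  \bigl\| \mathbf{Y} - \mathbf{B}\mathbf{A}^{\top}\mathbf{X} \bigr\|_{F}^{2}
```

Setting X = Y recovers PCA; replacing X with a kernel mapping and introducing row- and column-weighting matrices yields the weighted kernel variants that subsume the other CA methods listed above.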

Summary

A Least-Squares Framework for Component Analysis

Fernando De la Torre, Member, IEEE

INTRODUCTION
NOTATION
A GENERATIVE MODEL FOR COMPONENT ANALYSIS
Computational Aspects of LS-WKRRR
Weighted Extensions
NON-LINEAR EMBEDDING METHODS
LEAST-SQUARES EXTENSIONS OF CA METHODS
CONCLUSIONS AND FUTURE WORK
A: COVARIANCE MATRICES IN COMPONENT ANALYSIS
B: ABBREVIATIONS