Abstract

In recent years, advances in data collection and statistical analysis have made canonical correlation analysis (CCA) applicable to increasingly sophisticated research. CCA is the principal technique for two-view dimensionality reduction: it projects paired variables into a common subspace such that their mutual correlation is maximized. Over more than 80 years of development, numerous CCA models have been proposed under different machine learning mechanisms, yet the field lacks an insightful review of these state-of-the-art developments. This survey aims to provide a well-organized overview of CCA and its extensions. Specifically, we first review CCA theory from the perspectives of both model formulation and model optimization. The relationship between two popular solution methods, i.e., eigenvalue decomposition (EVD) and singular value decomposition (SVD), is discussed. Following that, we present a taxonomy of recent progress and classify the models into seven groups: 1) multi-view CCA, 2) probabilistic CCA, 3) deep CCA, 4) kernel CCA, 5) discriminative CCA, 6) sparse CCA and 7) locality preserving CCA. For each group, we describe two or three representative mathematical models, identifying their strengths and limitations. We summarize representative applications and numerical results of these seven groups in real-world practice, and collect the data sets and open-source implementations. Finally, we suggest several promising future research directions that could advance the current state of the art.
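To make the SVD-based solution mentioned above concrete, the following is a minimal sketch of classical two-view CCA, not drawn from the survey itself: the canonical directions are obtained from the SVD of the whitened cross-covariance matrix. The function name `cca_svd` and the small regularization term `reg` are illustrative assumptions.

```python
# Illustrative sketch of classical two-view CCA solved via SVD (assumed
# implementation, not the survey's own code). `reg` is an assumed small
# ridge term added for numerical stability when inverting covariances.
import numpy as np

def cca_svd(X, Y, n_components=2, reg=1e-8):
    """Return canonical weights (Wx, Wy) and canonical correlations."""
    # Center both views
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]

    # Covariance and cross-covariance matrices
    Sxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = X.T @ Y / (n - 1)

    # Inverse matrix square roots for whitening (symmetric EVD)
    def inv_sqrt(S):
        vals, vecs = np.linalg.eigh(S)
        return vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T

    Sxx_is, Syy_is = inv_sqrt(Sxx), inv_sqrt(Syy)

    # SVD of the whitened cross-covariance yields the canonical pairs;
    # singular values are the canonical correlations.
    U, s, Vt = np.linalg.svd(Sxx_is @ Sxy @ Syy_is)
    Wx = Sxx_is @ U[:, :n_components]
    Wy = Syy_is @ Vt[:n_components].T
    return Wx, Wy, s[:n_components]

# Usage example on two synthetic, correlated views
rng = np.random.default_rng(0)
Z = rng.normal(size=(500, 2))                       # shared latent signal
X = Z @ rng.normal(size=(2, 5)) + 0.1 * rng.normal(size=(500, 5))
Y = Z @ rng.normal(size=(2, 4)) + 0.1 * rng.normal(size=(500, 4))
Wx, Wy, corrs = cca_svd(X, Y)
print("canonical correlations:", np.round(corrs, 3))
```

An equivalent formulation solves a generalized eigenvalue problem on the same covariance blocks, which is the EVD route the abstract contrasts with the SVD route shown here.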
