Abstract

Canonical correlation analysis (CCA) is a well-known data analysis method that has been applied successfully in many areas. CCA extracts meaningful information from a pair of data sets by seeking pairs of linear combinations, one from each set of variables, with maximum correlation. Mathematically, CCA reduces to solving a large-scale generalized eigenvalue problem. However, when the dimension of the data is much larger than the number of samples, CCA may suffer from the small-sample-size (SSS) problem and from over-fitting. Regularization is often applied to overcome these difficulties, but it is difficult to choose the optimal regularization parameter in advance. In this work, we propose an Exponential Canonical Correlation Analysis (ECCA) method based on the matrix exponential, which is parameter-free and overcomes the over-fitting and SSS problems fundamentally. The computational overhead of ECCA, however, is very high in practice. Based on the randomized singular value decomposition (RSVD), we therefore propose a Randomized Exponential Canonical Correlation Analysis (RECCA) method for data analysis and dimensionality reduction. Theoretical results are given to justify this randomized method and to establish the relationship between RECCA and ECCA. Numerical experiments on real-world, high-dimensional and large-sample data sets illustrate the superiority of the proposed algorithms over many state-of-the-art CCA algorithms.
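
To make the RSVD building block concrete, the following is a minimal sketch of a generic randomized SVD with a Gaussian test matrix and a few power iterations. It is not the RECCA algorithm itself, whose details are not given in this abstract. The function name `randomized_svd` and the parameters `oversample`, `n_iter` and `seed` are illustrative assumptions.

```python
import numpy as np

def randomized_svd(A, rank, oversample=10, n_iter=2, seed=0):
    """Approximate rank-`rank` SVD of A (generic sketch, not the paper's RECCA)."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    # Sketch size: target rank plus a small oversampling margin (assumed default).
    k = min(rank + oversample, m, n)

    # Project A onto a random k-dimensional subspace to capture its range.
    Omega = rng.standard_normal((n, k))
    Q, _ = np.linalg.qr(A @ Omega)

    # Power iterations sharpen the captured subspace when singular values
    # decay slowly; re-orthonormalize at each step for numerical stability.
    for _ in range(n_iter):
        Z, _ = np.linalg.qr(A.T @ Q)
        Q, _ = np.linalg.qr(A @ Z)

    # Exact SVD of the small k x n matrix, then map back to the original space.
    B = Q.T @ A
    U_small, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ U_small
    return U[:, :rank], s[:rank], Vt[:rank]

# Usage on a synthetic matrix of exact rank 5 (assumed test data).
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 50))
U, s, Vt = randomized_svd(A, rank=5)
rel_err = np.linalg.norm(U @ np.diag(s) @ Vt - A) / np.linalg.norm(A)
```

The expensive decomposition is applied only to the small matrix `B`, which is where a randomized approach saves work over a full SVD of the original matrix.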
