Stochastic convex sparse principal component analysis.

Inci M Baytas,Anil K Jain,Fei Wang,Jiayu Zhou,Kaixiang Lin

doi:10.1186/s13637-016-0045-x

Abstract

Principal component analysis (PCA) is a dimensionality reduction and data analysis tool commonly used in many areas. The main idea of PCA is to represent high-dimensional data with a few representative components that capture most of the variance present in the data. However, traditional PCA has an obvious disadvantage when it is applied to data where interpretability is important. In applications where the features have physical meanings, we lose the ability to interpret the principal components extracted by conventional PCA because each principal component is a linear combination of all the original features. For this reason, sparse PCA has been proposed to improve the interpretability of traditional PCA by introducing sparsity to the loading vectors of the principal components. Sparse PCA can be formulated as an ℓ1-regularized optimization problem, which can be solved by proximal gradient methods. However, these methods do not scale well because computation of the exact gradient is generally required at each iteration. The stochastic gradient framework addresses this challenge by computing a stochastic estimate of the gradient at each iteration. Nevertheless, stochastic approaches typically converge slowly due to the high variance of these estimates. In this paper, we propose a convex sparse principal component analysis (Cvx-SPCA), which leverages a proximal variance reduced stochastic scheme to achieve a geometric convergence rate. We further show that the convergence analysis can be significantly simplified by using a weak condition, which allows the analysis to apply to a broader class of objectives. The efficiency and effectiveness of the proposed method are demonstrated on a large-scale electronic medical record cohort.

Highlights

Principal component analysis (PCA) is a commonly used dimensionality reduction and data analysis tool in many areas such as computer vision [1, 2], data mining [3, 4], biomedical informatics [5, 6], and many others. The goal of PCA is to learn a linear transformation such that the learned principal components are the dimensions retaining most of the variance in the data.
When traditional PCA is applied to medical data, the medical features are projected to a low-dimensional space in which each new feature is a linear combination of all the original features.
In this paper, we propose to use a proximal stochastic gradient method with a progressive variance reduction approach [15] to solve the problem in Eq (2).
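
To illustrate the kind of scheme the last highlight refers to, the following is a minimal sketch of a proximal stochastic gradient method with variance reduction. Because Eq (2) is not reproduced in this summary, the sketch uses an ℓ1-regularized least-squares problem as a stand-in objective; it is not the Cvx-SPCA objective. The function names, the step size rule (0.1 divided by the largest squared row norm), the inner loop length, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (elementwise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_objective(A, b, w, lam):
    """Stand-in objective: (1/2n) * ||A w - b||^2 + lam * ||w||_1."""
    n = A.shape[0]
    return 0.5 * np.sum((A @ w - b) ** 2) / n + lam * np.sum(np.abs(w))

def prox_svrg_lasso(A, b, lam, step, n_epochs=30, seed=0):
    """Variance-reduced proximal stochastic gradient sketch for the stand-in objective."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    w_snap = np.zeros(d)
    for _ in range(n_epochs):
        # Exact gradient of the smooth part, computed once per epoch at the snapshot.
        full_grad = A.T @ (A @ w_snap - b) / n
        w = w_snap.copy()
        for _ in range(n):  # inner loop length n is an assumption
            i = rng.integers(n)
            a_i = A[i]
            g_cur = a_i * (a_i @ w - b[i])        # sample gradient at current point
            g_snap = a_i * (a_i @ w_snap - b[i])  # same sample's gradient at snapshot
            v = g_cur - g_snap + full_grad        # variance-reduced gradient estimate
            w = soft_threshold(w - step * v, step * lam)  # proximal step for the l1 term
        w_snap = w  # use the last inner iterate as the next snapshot
    return w_snap

# Synthetic usage example (all values hypothetical).
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 20))
w_true = np.zeros(20)
w_true[:3] = [2.0, -3.0, 1.5]
b = A @ w_true + 0.01 * rng.standard_normal(200)
lam = 0.1
step = 0.1 / np.max(np.sum(A ** 2, axis=1))
w_hat = prox_svrg_lasso(A, b, lam=lam, step=step)
print(lasso_objective(A, b, w_hat, lam), np.count_nonzero(w_hat))
```

The exact gradient is evaluated only once per epoch, while each inner step touches a single sample. The correction term `g_cur - g_snap + full_grad` keeps the estimate unbiased while its variance shrinks as the iterate approaches the snapshot, which is the general mechanism behind the faster convergence of variance-reduced schemes.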

Summary

Introduction

Principal component analysis (PCA) is a commonly used dimensionality reduction and data analysis tool in many areas such as computer vision [1, 2], data mining [3, 4], biomedical informatics [5, 6], and many others. The goal of PCA is to learn a linear transformation such that the learned principal components are the dimensions retaining most of the variance in the data. In the solution of Eq (1), the principal components are linear combinations of all input variables. This means that the columns of the Z matrix, which are called the loadings of the principal components, are dense. PCA works well if we are not interested in the physical meanings of the features or if the interpretation of the principal components is not crucial for the application. When traditional PCA is applied to medical data, however, the medical features are projected to a low-dimensional space in which each new feature is a linear combination of all the original features. In this case, it is hard to comprehend the meaning of the new features.
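
The density of the loadings can be checked numerically. The sketch below computes conventional PCA loadings from the eigendecomposition of the sample covariance matrix and counts the nonzero entries in each loading vector. The helper name `pca_loadings`, the synthetic data, and the 1e-8 threshold are assumptions for illustration; the data is not the medical cohort used in the paper.

```python
import numpy as np

def pca_loadings(X, k):
    """Return the top-k PCA loading vectors as columns (X is n samples x d features)."""
    Xc = X - X.mean(axis=0)                 # center each feature
    cov = Xc.T @ Xc / Xc.shape[0]           # d x d sample covariance
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]   # indices of the k largest eigenvalues
    return eigvecs[:, order]

# Synthetic correlated features (hypothetical data).
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10)) @ rng.standard_normal((10, 10))

Z = pca_loadings(X, k=2)
nonzero_per_component = (np.abs(Z) > 1e-8).sum(axis=0)
print(nonzero_per_component)  # number of original features used by each component
```

On generic data, each component draws on essentially every original feature, which is what makes the new features hard to interpret and motivates imposing sparsity on the loadings.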

Results

Discussion

Conclusion