Abstract

We introduce a new method for sparse principal component analysis, based on the aggregation of eigenvector information from carefully selected axis-aligned random projections of the sample covariance matrix. Unlike most alternative approaches, our algorithm is non-iterative, so it is not vulnerable to a bad choice of initialization. We provide theoretical guarantees under which our principal subspace estimator can attain the minimax optimal rate of convergence in polynomial time. In addition, our theory provides a more refined understanding of the statistical and computational trade-off in the problem of sparse principal component estimation, revealing a subtle interplay between the effective sample size and the number of random projections that are required to achieve the minimax optimal rate. Numerical studies provide further insight into the procedure and confirm its highly competitive finite-sample performance.
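To make the aggregation idea concrete, the sketch below shows one way such a procedure could look in NumPy. It is a hedged simplification, not the paper's exact algorithm: the function name sparse_pca_rp, the group sizes, and the eigenvalue-weighted importance rule used to combine the selected eigenvectors are all illustrative assumptions.

```python
import numpy as np

def sparse_pca_rp(X, k, n_groups=100, n_proj_per_group=50, rng=None):
    """Sparse PCA sketch: aggregate leading eigenvectors of axis-aligned
    random projections of the sample covariance matrix.

    Illustrative simplification; the selection and aggregation rules here
    are assumptions, not the paper's exact procedure.
    """
    rng = np.random.default_rng(rng)
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / n                                  # sample covariance matrix

    importance = np.zeros(p)
    for _ in range(n_groups):
        best_val, best_vec, best_idx = -np.inf, None, None
        for _ in range(n_proj_per_group):
            idx = rng.choice(p, size=k, replace=False)  # axis-aligned projection
            vals, vecs = np.linalg.eigh(S[np.ix_(idx, idx)])
            if vals[-1] > best_val:                    # keep the projection with the
                best_val = vals[-1]                    # largest leading eigenvalue
                best_vec, best_idx = vecs[:, -1], idx
        # aggregate: eigenvalue-weighted coordinate importance of the group winner
        importance[best_idx] += best_val * np.abs(best_vec)

    # final estimate: leading eigenvector restricted to the k most important coords
    support = np.argsort(importance)[-k:]
    vals, vecs = np.linalg.eigh(S[np.ix_(support, support)])
    v_hat = np.zeros(p)
    v_hat[support] = vecs[:, -1]
    return v_hat

# Toy check on a spiked covariance model with a 10-sparse leading direction
rng = np.random.default_rng(0)
p, n, k = 200, 100, 10
v = np.zeros(p)
v[:k] = 1 / np.sqrt(k)
X = rng.standard_normal((n, p)) + np.sqrt(3.0) * rng.standard_normal((n, 1)) * v
print(abs(sparse_pca_rp(X, k=k, rng=1) @ v))           # near 1 when v is recovered
```

Because each candidate projection only requires the eigendecomposition of a k × k submatrix, the per-projection cost is independent of the ambient dimension p, which is what makes a non-iterative, aggregation-based approach computationally plausible.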

Highlights

  • Principal component analysis (PCA) is one of the most widely used techniques for dimensionality reduction in statistics, image processing and many other fields

  • Despite its successes and enormous popularity, it has been well known for a decade or more that PCA breaks down as soon as the dimensionality p of the data is of the same order as the sample size n

  • From a theoretical point of view, our algorithm provides a new perspective on the statistical and computational trade-off that is involved in the sparse principal component analysis (SPCA) problem

Introduction

Principal component analysis (PCA) is one of the most widely used techniques for dimensionality reduction in statistics, image processing and many other fields. In the simplest setting, where we seek a single, univariate projection of our data, we may estimate this optimal direction by computing the leading eigenvector of the sample covariance matrix. This method is guaranteed to converge from any initialization, and so does not suffer the vulnerability to a bad choice of starting point mentioned above. For a, b ∈ ℝ, we write a ≲ b to mean that there is a universal constant C > 0 such that a ≤ Cb.
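As a concrete illustration of that classical estimator (a minimal NumPy sketch, not code from the paper), the leading principal component is the top eigenvector of the sample covariance matrix:

```python
import numpy as np

def leading_pc(X):
    """Leading principal component of the data matrix X (n x p)."""
    Xc = X - X.mean(axis=0)          # centre the data
    S = Xc.T @ Xc / X.shape[0]       # sample covariance matrix
    vals, vecs = np.linalg.eigh(S)   # eigh returns eigenvalues in ascending order
    return vecs[:, -1]               # eigenvector of the largest eigenvalue
```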

The remaining sections of the paper are:

  • Sparse principal component analysis via random projections
  • Theoretical guarantees
  • Numerical experiments
  • First principal component
  • Findings
  • Proof of Corollary 1