On Consistency and Sparsity for Principal Components Analysis in High Dimensions

Iain M Johnstone, Arthur Yu Lu

doi:10.1198/jasa.2009.0121

Abstract

Principal components analysis (PCA) is a classic method for the reduction of dimensionality of data in the form of n observations (or cases) of a vector with p variables. Contemporary datasets often have p comparable with or even much larger than n. Our main assertions, in such settings, are (a) that some initial reduction in dimensionality is desirable before applying any PCA-type search for principal modes, and (b) the initial reduction in dimensionality is best achieved by working in a basis in which the signals have a sparse representation. We describe a simple asymptotic model in which the estimate of the leading principal component vector via standard PCA is consistent if and only if p(n)/n → 0. We provide a simple algorithm for selecting a subset of coordinates with largest sample variances, and show that if PCA is done on the selected subset, then consistency is recovered, even if p(n) ≫ n.
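The select-then-PCA idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not state the selection rule, so the code simply keeps a fixed number `k` of coordinates with the largest sample variances. The helper names (`leading_pc`, `subset_pca`, `abs_cosine`), the single-component simulation, and all numeric settings are assumptions made for illustration, and the data are taken to be already expressed in a basis where the signal is sparse.

```python
import numpy as np


def leading_pc(X):
    """Leading eigenvector of the sample covariance of X (rows are observations)."""
    Xc = X - X.mean(axis=0)
    # The first right singular vector of the centered data is the leading PC.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return vt[0]


def subset_pca(X, k):
    """Hypothetical helper: PCA restricted to the k highest-variance coordinates.

    The fixed k is an assumption of this sketch, not a rule from the paper.
    """
    variances = X.var(axis=0, ddof=1)
    keep = np.argsort(variances)[-k:]      # indices of the k largest sample variances
    v = np.zeros(X.shape[1])
    v[keep] = leading_pc(X[:, keep])       # zero outside the selected subset
    return v, np.sort(keep)


def abs_cosine(u, v):
    """Absolute cosine of the angle between u and v (sign of a PC is arbitrary)."""
    return abs(u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))


# Assumed simulation: one sparse component plus unit-variance noise, with p >> n.
rng = np.random.default_rng(0)
n, p, k = 100, 2000, 10
rho = np.zeros(p)
rho[:k] = 1.0 / np.sqrt(k)                 # true component, supported on k coordinates
scores = 5.0 * rng.standard_normal(n)      # random component scores
X = np.outer(scores, rho) + rng.standard_normal((n, p))

v_full = leading_pc(X)
v_sub, kept = subset_pca(X, k)
print("standard PCA |cos|:", round(abs_cosine(v_full, rho), 3))
print("subset PCA   |cos|:", round(abs_cosine(v_sub, rho), 3))
```

Under these assumed settings, the estimate from the selected subset should align more closely with the true component than the estimate from standard PCA on all p coordinates, which is the qualitative behavior the abstract describes.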

Journal: Journal of the American Statistical Association
Publication Date: Jun 1, 2009
Citations: 885

Similar Papers

Author response: Sparse dimensionality reduction approaches in Mendelian randomisation with highly correlated exposures
Vasileios Karageorgiou ... Verena Zuber
28 Nov 2022

Scalable Low-rank Matrix and Tensor Decomposition on Graphs
01 Jan 2017

Principal component analysis based methods in bioinformatics studies
S Ma ... Y Dai
Briefings in Bioinformatics | VOL. 12
17 Jan 2011

Comparative Analysis of Principal Components Can be Misleading.
Josef C Uyeda ... Matthew W Pennell
Systematic Biology | VOL. 64
03 Apr 2015
