Abstract

Principal component analysis (PCA) is a widely used technique for reducing dimension while retaining essential information. However, ordinary PCA lacks interpretability, especially for large-scale data. Sparse PCA (SPCA) has emerged as an interpretable variant that addresses this limitation. However, ordinary SPCA requires solving a challenging non-convex discrete optimization problem, which maximizes explained variance while constraining the number of non-zero elements in each principal component. In this paper, we propose a least angle SPCA technique that addresses the computational complexity of SPCA, particularly for ultrahigh-dimensional data, by sequentially identifying sparse principal components that have minimal angles to their corresponding components extracted through ordinary PCA. This sequential identification allows the optimization problem to be solved in polynomial time, significantly reducing the computational burden. Despite its efficiency gains, the proposed method preserves the main attributes of SPCA. Comprehensive experimental results demonstrate the advantages of our approach as a viable alternative for dealing with the computational difficulties inherent in ordinary SPCA. Notably, our method is an efficient and effective solution for ultrahigh-dimensional data analysis, enabling researchers to extract meaningful insights and streamline data interpretation.
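
To make the least angle idea concrete, the following is a minimal sketch, not the paper's algorithm. It relies on one standard fact: among unit vectors with at most k non-zero entries, the one with the smallest angle to a given vector v keeps the k largest-magnitude entries of v and rescales them to unit length. The function names, the single cardinality parameter `k`, and the absence of any deflation or orthogonality step between components are assumptions made for illustration; the method proposed in the paper may differ on all of these points.

```python
import numpy as np


def least_angle_sparse_vector(v, k):
    """Unit vector with at most k non-zeros that has the smallest angle to v.

    Hypothetical helper for illustration: hard-thresholds v to its k
    largest-magnitude entries and renormalizes.
    """
    v = np.asarray(v, dtype=float)
    if k < 1:
        raise ValueError("k must be at least 1")
    keep = np.argsort(np.abs(v))[-k:]  # indices of the k largest |v_i|
    u = np.zeros_like(v)
    u[keep] = v[keep]
    norm = np.linalg.norm(u)
    if norm == 0.0:
        raise ValueError("v has no non-zero entries")
    return u / norm


def sequential_sparse_components(X, n_components, k):
    """Sparsify the leading ordinary PCA loadings one component at a time.

    Assumption: each sparse loading is matched only to its own PCA loading;
    no deflation is applied between components.
    """
    Xc = X - X.mean(axis=0)  # center the columns
    # Rows of Vt are the ordinary PCA loading vectors (unit length).
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    pca_loadings = Vt[:n_components]
    sparse_loadings = np.array(
        [least_angle_sparse_vector(row, k) for row in pca_loadings]
    )
    # Cosine of the angle between each sparse loading and its PCA loading.
    cosines = np.sum(sparse_loadings * pca_loadings, axis=1)
    return sparse_loadings, cosines


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 8))
    loadings, cosines = sequential_sparse_components(X, n_components=2, k=3)
    print(loadings.shape, np.count_nonzero(loadings, axis=1))
```

In this sketch the cost is dominated by a single singular value decomposition plus a sort per component, which illustrates why a least angle criterion can avoid the combinatorial search over supports that variance-maximizing SPCA requires.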