Non-Greedy L21-Norm Maximization for Principal Component Analysis.

Feiping Nie, Chris Ding, Heng Huang, Lai Tian

doi:10.1109/tip.2021.3073282

Abstract

Principal Component Analysis (PCA) is one of the most important unsupervised methods for handling high-dimensional data. However, because its eigen-decomposition solution has high computational complexity, PCA is hard to apply to large-scale, high-dimensional data, e.g., millions of data points with millions of variables. Moreover, its squared L2-norm based objective makes it sensitive to outliers. In recent research, an L1-norm maximization based PCA method was proposed for efficient computation and robustness to outliers. However, that work used a greedy strategy to solve for the eigenvectors. In addition, the L1-norm maximization based objective may not be the correct robust PCA formulation, because it loses the theoretical connection to the minimization of data reconstruction error, which is one of the most important intuitions and goals of PCA. In this paper, we propose to maximize an L21-norm based robust PCA objective, which is theoretically connected to the minimization of reconstruction error. More importantly, we propose efficient non-greedy optimization algorithms, with theoretically guaranteed convergence, to solve both our objective and the more general L21-norm maximization problem. Experimental results on real-world data sets show the effectiveness of the proposed method for principal component analysis.
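
To make the objective concrete, the following is a minimal NumPy sketch, not taken from the paper's full text. It assumes the L21-norm objective has the form max over W with W^T W = I of sum_i ||W^T x_i||_2, and it uses a standard fixed-point update that refreshes all columns of W at once (non-greedy) through an SVD. The function names, the random initialization, and the fixed iteration count are illustrative assumptions and may differ from the authors' algorithm.

```python
import numpy as np


def l21_objective(X, W):
    """Sum over samples of the Euclidean norm of the projection W^T x_i."""
    return float(np.linalg.norm(X @ W, axis=1).sum())


def l21_pca_sketch(X, k, n_iter=50, seed=0, eps=1e-12):
    """Non-greedy fixed-point sketch for max_{W^T W = I} sum_i ||W^T x_i||_2.

    X has shape (n_samples, n_features) and is assumed to be centered.
    Returns W (n_features, k) and the objective value after each update.
    """
    rng = np.random.default_rng(seed)
    # Random orthonormal starting point (an assumption of this sketch).
    W, _ = np.linalg.qr(rng.standard_normal((X.shape[1], k)))
    history = [l21_objective(X, W)]
    for _ in range(n_iter):
        P = X @ W                                    # row i is (W^T x_i)^T
        norms = np.linalg.norm(P, axis=1, keepdims=True)
        A = P / np.maximum(norms, eps)               # unit directions alpha_i
        M = X.T @ A                                  # sum_i x_i alpha_i^T
        U, _, Vt = np.linalg.svd(M, full_matrices=False)
        W = U @ Vt                                   # maximizes tr(W^T M) s.t. W^T W = I
        history.append(l21_objective(X, W))
    return W, history
```

Each update cannot decrease the objective: by the Cauchy-Schwarz inequality, the new objective is at least tr(W_new^T M), which is at least tr(W_old^T M), which equals the old objective. All k directions are updated jointly, in contrast to a greedy scheme that extracts one direction at a time.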

Full Text