Abstract

Principal component analysis (PCA) is a widely used method for analyzing high-dimensional data by projecting it onto a low-dimensional subspace. Several variants of PCA have been proposed to improve the interpretability of the principal components (PCs). One of the most common is sparse PCA, which seeks a sparse basis that is easier to interpret than the dense basis of standard PCA. However, the performance of these variants is still far from satisfactory because the data still contain redundant PCs. In this paper, we propose a novel method, PCA based on graph Laplacian and double sparse constraints (GDSPCA), which improves the interpretability of the PCs while accounting for the internal geometry of the data. Specifically, GDSPCA applies L<sub>2,1</sub>-norm and L<sub>1</sub>-norm regularization terms simultaneously to make the matrix sparse by filtering out redundant and irrelevant PCs: the L<sub>2,1</sub>-norm term produces row sparsity, while the L<sub>1</sub>-norm term enforces element-wise sparsity. As a result, the new PCs in the low-dimensional subspace are easier to interpret. In addition, GDSPCA integrates the graph Laplacian into PCA to capture the geometric structure hidden in the data. A simple and effective optimization procedure is provided. Extensive experiments on multi-view biological data demonstrate the feasibility and effectiveness of the proposed approach.
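The three ingredients named above (the L<sub>2,1</sub>-norm, the L<sub>1</sub>-norm, and the graph Laplacian) can be illustrated with a minimal NumPy sketch. This is not the paper's optimization procedure, which the abstract does not spell out. The function names, the toy matrix `W`, the threshold `t`, and the use of the unnormalized Laplacian are assumptions chosen for illustration. The shrinkage functions are the standard proximal operators of the two norms, shown only to make the difference between row sparsity and element-wise sparsity concrete.

```python
import numpy as np

def l1_norm(W):
    """L1 norm: sum of the absolute values of all entries."""
    return np.abs(W).sum()

def l21_norm(W):
    """L2,1 norm: sum of the Euclidean norms of the rows."""
    return np.linalg.norm(W, axis=1).sum()

def prox_l1(W, t):
    """Soft-threshold every entry by t; zeros out individual elements."""
    return np.sign(W) * np.maximum(np.abs(W) - t, 0.0)

def prox_l21(W, t):
    """Shrink every row by t in Euclidean norm; zeros out whole rows."""
    row_norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(1.0 - t / np.maximum(row_norms, 1e-12), 0.0)
    return scale * W

def graph_laplacian(A):
    """Unnormalized graph Laplacian L = D - A for a symmetric affinity matrix A."""
    return np.diag(A.sum(axis=1)) - A

# Toy matrix (hypothetical values): rows play the role of features.
W = np.array([[3.0, 4.0],
              [0.3, 0.4],
              [2.0, 0.1]])

W_rows = prox_l21(W, t=1.0)   # second row is removed as a whole
W_elems = prox_l1(W, t=1.0)   # small entries are removed one by one

# Toy affinity matrix for a 3-node chain graph (hypothetical).
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
L = graph_laplacian(A)
```

In this toy case the L<sub>2,1</sub> shrinkage discards the weak second row entirely while keeping both entries of the third row, whereas the L<sub>1</sub> shrinkage also zeros the small second entry of the third row. Using both penalties together combines the two effects.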
