Principal component analysis (PCA) is a widely used statistical technique for dimensionality reduction that extracts a low-dimensional subspace in which the variance is maximised (equivalently, the reconstruction error is minimised). To improve the interpretability of the learned representations, several variants of PCA, such as sparse PCA and group sparse PCA, have recently been developed to estimate the principal components from a small number of input features (variables). However, most existing methods either require all the input variables to be measured or select a redundant set of features. A further challenge is that they need the sparsity level of the coefficient matrix to be specified in advance. To address these issues, we propose an elastic-net regularisation for sparse group PCA (ESGPCA), which incorporates sparsity constraints into the objective function to account for both within-group and between-group sparsity. This sparse learning approach allows us to discover the sparse principal loading vectors automatically, without any prior assumption about the input features. We solve the non-smooth regularised problem using the alternating direction method of multipliers (ADMM), an efficient distributed optimisation technique. Empirical evaluations on both synthetic and real datasets demonstrate the effectiveness of ESGPCA and show that it performs favourably against the compared methods.
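To make the idea of within-group and between-group sparsity concrete, the following is a minimal sketch of the standard sparse-group shrinkage step (element-wise soft-thresholding followed by group-wise shrinkage), of the kind that typically appears as a subproblem when such penalties are handled with ADMM. The abstract does not state the ESGPCA objective, so this is not the authors' update rule: the function names, the grouping, and the two penalty weights `lam_l1` and `lam_group` are hypothetical, and the ridge part of an elastic-net penalty is omitted.

```python
import math

def soft_threshold(x, lam):
    """Shrink a scalar towards zero by lam (proximal operator of lam * |x|)."""
    return math.copysign(max(abs(x) - lam, 0.0), x)

def sparse_group_prox(v, groups, lam_l1, lam_group):
    """Proximal operator of lam_l1 * ||v||_1 + lam_group * sum_g ||v_g||_2.

    v      : list of floats (e.g. one loading vector)
    groups : list of index lists forming a partition of range(len(v))
    This is the generic sparse-group lasso operator, used here only as an
    illustration; the penalty in the paper may differ.
    """
    out = [0.0] * len(v)
    for idx in groups:
        # Within-group sparsity: zero out individually small entries.
        u = [soft_threshold(v[i], lam_l1) for i in idx]
        norm = math.sqrt(sum(x * x for x in u))
        # Between-group sparsity: drop the whole group if it is weak overall.
        if norm > lam_group:
            scale = 1.0 - lam_group / norm
            for i, x in zip(idx, u):
                out[i] = scale * x
    return out

# Hypothetical loading vector with two groups of two features each.
loading = [3.0, -0.5, 0.2, 0.1]
groups = [[0, 1], [2, 3]]
shrunk = sparse_group_prox(loading, groups, lam_l1=0.5, lam_group=0.5)
```

In this toy input the second group is removed entirely, while the first group survives with one of its two entries set to zero, which is the combination of between-group and within-group sparsity the abstract describes.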