AbstractWe study a variant of the sparse PCA (principal component analysis) problem in the “hard” regime, where the inference task is possible yet no polynomial‐time algorithm is known to exist. Prior work, based on the low‐degree likelihood ratio, has conjectured a precise expression for the best possible (subexponential) runtime throughout the hard regime. Following instead a statistical physics‐inspired point of view, we show bounds on the depth of free energy wells for various Gibbs measures naturally associated to the problem. These free energy wells imply hitting time lower bounds that corroborate the low‐degree conjecture: we show that a class of natural MCMC (Markov chain Monte Carlo) methods (with worst‐case initialization) cannot solve sparse PCA with less than the conjectured runtime. These lower bounds apply to a wide range of values for two tuning parameters: temperature and sparsity misparametrization. Finally, we prove that the overlap gap property (OGP), a structural property that implies failure of certain local search algorithms, holds in a significant part of the hard regime. © 2022 Wiley Periodicals, Inc.
Read full abstract