Abstract
Many high-dimensional data mining problems can be formulated as minimizing an empirical loss function with a penalty proportional to the number of variables required to describe a model. We propose a graduated non-convexification method to facilitate tracking of a global minimizer of this problem. We prove that, under some conditions, the proposed regularization problem using the continuous piecewise-linear approximation is equivalent to the original l0-regularization problem. In addition, a family of graduated non-convex approximations is proposed to approximate its continuous l1 approximation. Computational results are presented to illustrate the performance.
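The paper's exact piecewise-linear approximation is not spelled out here; as a hedged illustration, a standard continuous piecewise-linear surrogate for the l0 count penalty is the capped-l1 function, where each coordinate contributes min(|x_i|/tau, 1). The function names and the parameter tau below are illustrative assumptions, not the paper's notation.

```python
import numpy as np

def capped_l1(x, tau):
    """Continuous piecewise-linear surrogate for ||x||_0 (illustrative).
    Each coordinate contributes min(|x_i| / tau, 1); as tau -> 0 this
    tends to the exact count of nonzero entries."""
    return float(np.minimum(np.abs(x) / tau, 1.0).sum())

def l0(x):
    """Exact l0 "norm": number of nonzero entries."""
    return int(np.count_nonzero(x))

# For tau small relative to the nonzero magnitudes, the surrogate
# matches the exact l0 count; larger tau gives a smoother penalty.
x = np.array([0.0, 0.5, -2.0])
```

Varying tau from large to small yields a graduated family of penalties, which is the general flavor of a graduated non-convexification scheme: solve an easier (smoother) problem first, then track its minimizer as the penalty sharpens toward l0.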
Highlights
Sparsity is a desired property in model estimation since it often leads to better interpretability and out-of-sample predictability.
Sparse model estimation is sometimes referred to as variable selection.
We generate random sparse model selection problems based on least squares data fitting problems.
GNC1 Algorithm
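The random test-problem setup highlighted above might be sketched as follows; the dimensions, sparsity level, and noise scale are illustrative assumptions, not the paper's experimental settings.

```python
import numpy as np

def make_sparse_ls_problem(m=50, n=200, k=5, noise=0.01, seed=0):
    """Generate a random least-squares instance b = A @ x_true + noise,
    where x_true has exactly k nonzero entries (illustrative setup)."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((m, n))          # random design matrix
    x_true = np.zeros(n)
    support = rng.choice(n, size=k, replace=False)
    x_true[support] = rng.standard_normal(k) # k-sparse ground truth
    b = A @ x_true + noise * rng.standard_normal(m)
    return A, b, x_true

A, b, x_true = make_sparse_ls_problem()
```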
Summary
Sparsity is a desired property in model estimation since it often leads to better interpretability and out-of-sample predictability. Selecting a model with a small number of variables can be formulated as minimizing an empirical loss function with a penalty on the number of nonzero variables; this is referred to as l0-regularization. This is an NP-hard global optimization problem; see, e.g., [2], [3]. Due to its computational simplicity, regularization based on the l2 norm is popular in practice; this is referred to as ridge regression.
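The two penalized objectives contrasted above can be sketched directly; the regularization weight `lam` and the toy data below are illustrative assumptions.

```python
import numpy as np

def l0_objective(A, b, x, lam):
    """Least-squares loss plus lam times the number of nonzeros
    (l0-regularization): any nonzero coordinate pays the same price."""
    return 0.5 * float(np.sum((A @ x - b) ** 2)) + lam * np.count_nonzero(x)

def ridge_objective(A, b, x, lam):
    """Least-squares loss plus lam times the squared l2 norm
    (ridge regression): the penalty shrinks coefficients smoothly."""
    return 0.5 * float(np.sum((A @ x - b) ** 2)) + lam * float(np.sum(x ** 2))

# Tiny example: a perfect fit with one nonzero coefficient.
A = np.array([[1.0, 0.0], [0.0, 1.0]])
b = np.array([1.0, 0.0])
x = np.array([1.0, 0.0])
```

The ridge penalty is convex and smooth, which is what makes it computationally easy; the l0 penalty is discontinuous at zero, which is the source of the NP-hardness noted above.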