Abstract
Many high-dimensional data mining problems can be formulated as minimizing an empirical loss function with a penalty proportional to the number of variables required to describe a model. We propose a graduated non-convexification method to facilitate tracking of a global minimizer of this problem. We prove that, under some conditions, the proposed regularization problem using the continuous piecewise-linear approximation is equivalent to the original l0-regularization problem. In addition, a family of graduated non-convex approximations is proposed to approximate its continuous l1 approximation. Computational results are presented to illustrate the performance.
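The paper's exact piecewise-linear approximation is not spelled out here; as a hedged illustration, a standard continuous piecewise-linear surrogate for the l0 count penalty is the capped-l1 function, where each coordinate contributes min(|x_i|/tau, 1). The function names and the parameter tau below are illustrative assumptions, not the paper's notation.

```python
import numpy as np

def capped_l1(x, tau):
    """Continuous piecewise-linear surrogate for ||x||_0 (illustrative).
    Each coordinate contributes min(|x_i| / tau, 1); as tau -> 0 this
    tends to the exact count of nonzero entries."""
    return float(np.minimum(np.abs(x) / tau, 1.0).sum())

def l0(x):
    """Exact l0 "norm": number of nonzero entries."""
    return int(np.count_nonzero(x))

# For tau small relative to the nonzero magnitudes, the surrogate
# matches the exact l0 count; larger tau gives a smoother penalty.
x = np.array([0.0, 0.5, -2.0])
```

Varying tau from large to small yields a graduated family of penalties, which is the general flavor of a graduated non-convexification scheme: solve an easier (smoother) problem first, then track its minimizer as the penalty sharpens toward l0.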
Highlights
Sparsity is a desired property in model estimation since it often leads to better interpretability and out-of-sample predictability.
Sparse model estimation is sometimes referred to as variable selection.
We generate random sparse model selection problems based on least squares data fitting problems.
GNC1 Algorithm
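The random test-problem setup highlighted above might be sketched as follows; the dimensions, sparsity level, and noise scale are illustrative assumptions, not the paper's experimental settings.

```python
import numpy as np

def make_sparse_ls_problem(m=50, n=200, k=5, noise=0.01, seed=0):
    """Generate a random least-squares instance b = A @ x_true + noise,
    where x_true has exactly k nonzero entries (illustrative setup)."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((m, n))          # random design matrix
    x_true = np.zeros(n)
    support = rng.choice(n, size=k, replace=False)
    x_true[support] = rng.standard_normal(k) # k-sparse ground truth
    b = A @ x_true + noise * rng.standard_normal(m)
    return A, b, x_true

A, b, x_true = make_sparse_ls_problem()
```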
Summary
Sparsity is a desired property in model estimation since it often leads to better interpretability and out-of-sample predictability. Selecting a model with a small number of variables can be formulated as minimizing an empirical loss function with a penalty on the number of nonzero variables; this is referred to as l0-regularization. This is an NP-hard global optimization problem; see, e.g., [2], [3]. Due to its computational simplicity, regularization based on the l2 norm is popular in practice; this is referred to as ridge regression.
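The two penalized objectives contrasted above can be sketched directly; the regularization weight `lam` and the toy data below are illustrative assumptions.

```python
import numpy as np

def l0_objective(A, b, x, lam):
    """Least-squares loss plus lam times the number of nonzeros
    (l0-regularization): any nonzero coordinate pays the same price."""
    return 0.5 * float(np.sum((A @ x - b) ** 2)) + lam * np.count_nonzero(x)

def ridge_objective(A, b, x, lam):
    """Least-squares loss plus lam times the squared l2 norm
    (ridge regression): the penalty shrinks coefficients smoothly."""
    return 0.5 * float(np.sum((A @ x - b) ** 2)) + lam * float(np.sum(x ** 2))

# Tiny example: a perfect fit with one nonzero coefficient.
A = np.array([[1.0, 0.0], [0.0, 1.0]])
b = np.array([1.0, 0.0])
x = np.array([1.0, 0.0])
```

The ridge penalty is convex and smooth, which is what makes it computationally easy; the l0 penalty is discontinuous at zero, which is the source of the NP-hardness noted above.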