Variable Selection Diagnostics Measures for High-Dimensional Regression

Ying Nan,Yuhong Yang

doi:10.1080/10618600.2013.829780

Abstract

Many exciting results have been obtained on model selection for high-dimensional data in both efficient algorithms and theoretical developments. The powerful penalized regression methods can give sparse representations of the data even when the number of predictors is much larger than the sample size. One important question then is: How do we know when a sparse pattern identified by such a method is reliable? In this work, besides investigating instability of model selection methods in terms of variable selection, we propose variable selection deviation measures that give one a proper sense on how many predictors in the selected set are likely trustworthy in certain aspects. Simulation and a real data example demonstrate the utility of these measures for application.

Full Text