Model selection strategies for identifying most relevant covariates in homoscedastic linear models

Aleksey Min, Hajo Holzmann, Claudia Czado

doi:10.1016/j.csda.2009.09.006

Abstract

A new method, in two variations, is proposed for identifying the most relevant covariates in linear models with homoscedastic errors. In contrast to many known selection criteria, the method is based on an interpretable scaled quantity. This quantity measures a maximal relative error one makes by selecting covariates from a given set of all available covariates. The proposed model selection procedures rely on the asymptotic normality of test statistics, and therefore normality of the errors in the regression model is not required. In a simulation study, the performance of the suggested methods is examined along with that of the standard model selection criteria AIC, BIC, Lasso and relaxed Lasso. The simulation study illustrates the favorable performance of the proposed method compared with these reference criteria, especially when the regression effects differ in magnitude by several orders. The accuracy of the normal approximation to the test statistics is also investigated; it is already satisfactory for sample sizes of 50 and 100. As an illustration, the US college spending data from 1994 are analyzed.
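
For orientation, the AIC and BIC reference criteria named above can be sketched as an exhaustive subset search. This is only an illustration of those two standard criteria under a Gaussian likelihood, not the method proposed in the paper. The function names, the simulated data and the coefficient values are assumptions made for this example.

```python
import itertools

import numpy as np


def gaussian_aic_bic(y, design):
    """Return (AIC, BIC) for a least-squares fit with homoscedastic Gaussian errors."""
    n, k = design.shape
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ beta) ** 2))
    # Log-likelihood with the error variance replaced by its MLE, rss / n.
    loglik = -0.5 * n * (np.log(2 * np.pi * rss / n) + 1)
    n_params = k + 1  # regression coefficients plus the error variance
    aic = 2 * n_params - 2 * loglik
    bic = np.log(n) * n_params - 2 * loglik
    return aic, bic


def best_subsets(y, X):
    """Score every non-empty covariate subset; an intercept is always included."""
    n, p = X.shape
    intercept = np.ones((n, 1))
    scores = {}
    for size in range(1, p + 1):
        for subset in itertools.combinations(range(p), size):
            design = np.hstack([intercept, X[:, list(subset)]])
            scores[subset] = gaussian_aic_bic(y, design)
    best_aic = min(scores, key=lambda s: scores[s][0])
    best_bic = min(scores, key=lambda s: scores[s][1])
    return best_aic, best_bic, scores


# Hypothetical simulated data: effects span several orders of magnitude,
# and the last covariate has no effect at all.
rng = np.random.default_rng(0)
n = 100
X = rng.normal(size=(n, 4))
beta_true = np.array([10.0, 1.0, 0.1, 0.0])
y = 1.0 + X @ beta_true + rng.normal(size=n)

best_aic, best_bic, scores = best_subsets(y, X)
print("AIC choice:", best_aic)
print("BIC choice:", best_bic)
```

Exhaustive search is feasible here only because there are four candidate covariates; with p covariates it scores 2^p - 1 submodels.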
