Abstract

We consider the situation in which a large database needs to be analyzed to identify a few important predictors of a given quantitative response variable. There is considerable evidence that in this case classical model selection criteria, such as the Akaike information criterion or the Bayesian information criterion (BIC), have a strong tendency to overestimate the number of regressors. In our earlier papers, we developed a modified version of BIC (mBIC), which enables the incorporation of prior knowledge on the number of regressors and prevents overestimation. In this article, we review earlier results on mBIC and discuss the relationship of this criterion to the well-known Bonferroni correction for multiple testing and to the Bayes oracle, which minimizes the expected costs of inference. We use computer simulations and a real data analysis to illustrate the performance of the original mBIC and of its rank version, which is designed to deal with data that contain some outlying observations. Copyright © 2008 John Wiley & Sons, Ltd.
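For orientation, a hedged sketch of the criterion discussed above: in its commonly cited form (the symbols n, k, m, RSS and c below are our notation, not taken from this abstract, and the exact definition in the article itself may differ in details), mBIC augments the usual BIC penalty with an extra term that acts as a Bonferroni-type correction over the m candidate regressors,

\[
\mathrm{mBIC} \;=\; n \log\!\left(\frac{\mathrm{RSS}}{n}\right) \;+\; k \log n \;+\; 2k \log\!\left(\frac{m}{c} - 1\right),
\]

where k is the number of regressors in the candidate model, RSS is its residual sum of squares, and c encodes the prior expected number of true regressors. When m/c is large, the additional penalty of roughly 2 log(m/c) per regressor corresponds approximately to testing each regressor at a Bonferroni-adjusted significance level of order c/m, which is the connection to multiple testing reviewed in the article.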
