A Loss-Based Prior for Variable Selection in Linear Regression Methods

Cristiano Villa, Jeong Eun Lee

doi:10.1214/19-ba1162

Abstract

In this work we propose a novel model prior for variable selection in linear regression. The idea is to determine the prior mass by considering the worth of each of the regression models, given the number of possible covariates under consideration. The worth of a model consists of the information loss and the loss due to model complexity. While the information loss is determined objectively, the loss expression due to model complexity is flexible, and the penalty on model size can even be customized to include some prior knowledge. Some versions of the loss-based prior are proposed and compared empirically. Through simulation studies and real data analyses, we compare the proposed prior with the Scott and Berger prior, for noninformative scenarios, and with the Beta-Binomial prior, for informative scenarios.
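The two comparison priors named in the abstract can be sketched in a few lines. This is a minimal illustration of their usual textbook forms, not code from the paper: the function names, the hyperparameters `a` and `b`, and the choice of `k = 4` covariates are assumptions made here for the example. Under the Beta-Binomial prior, one specific model containing `size` of the `k` covariates receives mass B(a + size, b + k - size) / B(a, b); setting a = b = 1 gives the form commonly attributed to Scott and Berger, 1 / ((k + 1) * C(k, size)).

```python
from math import comb, exp, lgamma


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_model_prior(size, k, a=1.0, b=1.0):
    """Prior mass of ONE specific model with `size` of the k covariates.

    a and b are the Beta hyperparameters (illustrative names, not the
    paper's notation).
    """
    if not 0 <= size <= k:
        raise ValueError("size must lie between 0 and k")
    return exp(log_beta(a + size, b + k - size) - log_beta(a, b))


def scott_berger_model_prior(size, k):
    """Special case a = b = 1: uniform over model sizes, then uniform
    over the models of each size."""
    return 1.0 / ((k + 1) * comb(k, size))


k = 4  # arbitrary number of candidate covariates for the illustration

# Mass per model, by model size, under the noninformative choice.
noninformative = [scott_berger_model_prior(s, k) for s in range(k + 1)]

# An informative choice that favours small models (a=1, b=3 is arbitrary).
informative = [beta_binomial_model_prior(s, k, a=1.0, b=3.0) for s in range(k + 1)]

# Summing over all C(k, s) models of each size recovers total mass 1.
total_noninformative = sum(comb(k, s) * p for s, p in enumerate(noninformative))
total_informative = sum(comb(k, s) * p for s, p in enumerate(informative))
```

Both priors depend on a model only through its size, which is why a per-size table is enough to describe them. The loss-based prior proposed in the paper is not reproduced here, since its exact expression is not given in this summary.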

Highlights

In this paper, we propose a method to derive model prior probabilities for variable selection problems in linear regression.
With a prior distribution on the space of models, representing the model uncertainty related to variable selection, one way to proceed is by using Bayesian model averaging (Hoeting et al., 1999).
When the model posterior distribution tends to be spread across many of the possible regression models, and when prediction is an important part of the statistical analysis, Raftery et al. (1997) show that Bayesian model averaging performs better than choosing the regression model with the highest posterior probability.
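For reference, the standard form of Bayesian model averaging for a future observation can be written as below. The notation is an assumption made here, not taken from the paper: $y$ is the observed data, $\tilde{y}$ a future observation, $\mathcal{M}$ the set of candidate regression models, and $p(M)$ the model prior that the paper sets out to construct.

```latex
p(\tilde{y} \mid y) \;=\; \sum_{M \in \mathcal{M}} p(\tilde{y} \mid M, y)\, p(M \mid y),
\qquad
p(M \mid y) \;=\; \frac{p(y \mid M)\, p(M)}{\sum_{M' \in \mathcal{M}} p(y \mid M')\, p(M')} .
```

Selecting only the highest-posterior model amounts to replacing the sum by its single largest-weight term, which discards a large share of the posterior mass when $p(M \mid y)$ is spread across many models.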

Summary

Introduction

We propose a method to derive model prior probabilities for variable selection problems in linear regression. When the model posterior distribution tends to be spread across many of the possible regression models, and when prediction is an important part of the statistical analysis, Raftery et al. (1997) show that Bayesian model averaging performs better than choosing the regression model with the highest posterior probability. The fact that a regression model has been chosen to be part of the model space (i) conveys information and (ii) induces complexity; as such, we can measure the loss in information carried by a model and the loss due to its complexity. These losses form the basis for determining the worth of the model and, from it, the model prior probability.

Notation and problem specification

Model priors in objective variable selection

Model prior based on losses

Setting the constant c

Simulation study

Non-informative simulation

Informative simulation

Illustrative examples with real data sets

Hald data

Large data set analysis

Discussion

Journal: Bayesian Analysis	Publication Date: Jun 27, 2019
Citations: 5	License type: cc-by
