Abstract

Consider the problem of high dimensional variable selection for the Gaussian linear model when the unknown error variance is also of interest. In this paper, we show that the use of conjugate shrinkage priors for Bayesian variable selection can have detrimental consequences for such variance estimation. Such priors are often motivated by the invariance argument of Jeffreys (1961). Revisiting this work, however, we highlight a caveat that Jeffreys himself noticed; namely that biased estimators can result from inducing dependence between parameters a priori. In a similar way, we show that conjugate priors for linear regression, which induce prior dependence, can lead to such underestimation in the Bayesian high-dimensional regression setting. Following Jeffreys, we recommend as a remedy to treat regression coefficients and the error variance as independent a priori. Using such an independence prior framework, we extend the Spike-and-Slab Lasso of Rockova and George (2018) to the unknown variance case. This extended procedure outperforms both the fixed variance approach and alternative penalized likelihood methods on simulated data. On the protein activity dataset of Clyde and Parmigiani (1998), the Spike-and-Slab Lasso with unknown variance achieves lower cross-validation error than alternative penalized likelihood methods, demonstrating the gains in predictive accuracy afforded by simultaneous error variance estimation. The unknown variance implementation of the Spike-and-Slab Lasso is provided in the publicly available R package SSLASSO (Rockova and Moran, 2017).

Highlights

  • Consider the classical linear regression model Y = Xβ + ε, ε ∼ N_n(0, σ²I_n) (1.1), where Y ∈ R^n is a vector of responses, X = [X_1, . . . , X_p] ∈ R^{n×p} is a fixed regression matrix of p potential predictors, β = (β_1, . . . , β_p)^T ∈ R^p is a vector of unknown regression coefficients, and ε ∈ R^n is a noise vector of independent normal random variables with σ² as their unknown common variance. When β is sparse, so that most of its elements are zero or negligible, finding the non-negligible elements of β, the so-called variable selection problem, is of particular importance

  • We show that for continuous Bayesian variable selection methods, the conjugate prior framework can result in underestimation of the error variance when: (i) the regression coefficients β are sparse; and (ii) p is of the same order as, or larger than, n

  • We have shown that conjugate continuous priors for Bayesian variable selection can lead to underestimation of the error variance when (i) β is sparse; and (ii) when p is of the same order as, or larger than, n
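To make the setup in the first highlight concrete, here is a minimal Python sketch simulating from model (1.1) with a sparse β; the dimensions, seed, and coefficient values are illustrative choices, not taken from the paper. It also shows why ordinary least squares alone cannot perform selection: its estimate has no exact zeros.

```python
import numpy as np

# Simulate from model (1.1): Y = X beta + eps, eps ~ N_n(0, sigma^2 I_n),
# with a sparse coefficient vector (dimensions here are illustrative).
rng = np.random.default_rng(1)
n, p, sigma = 200, 50, 1.0
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[[0, 1, 2]] = [3.0, -2.0, 1.5]   # only 3 of the 50 predictors matter
y = X @ beta + sigma * rng.standard_normal(n)

# Ordinary least squares returns a dense estimate: every coordinate is
# (almost surely) nonzero, so it cannot identify the negligible predictors.
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
print(int(np.sum(beta != 0)))                    # 3 truly active predictors
print(int(np.sum(np.abs(beta_ols) > 1e-10)))     # 50: OLS gives no exact zeros
```

This is why sparsity-inducing procedures such as the Spike-and-Slab Lasso, which can set coefficients exactly to zero, are needed for the variable selection problem.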


Summary

Introduction

We show that for continuous Bayesian variable selection methods, the conjugate prior framework can result in underestimation of the error variance when: (i) the regression coefficients β are sparse; and (ii) p is of the same order as, or larger than, n. Conjugate priors implicitly add p “pseudo-observations” to the posterior, which can distort inference for the error variance when the true number of non-zero elements of β is much smaller than p. This is not the case for discrete component methods, which adaptively reduce the size of β.
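The degrees-of-freedom effect behind this underestimation can be illustrated numerically. The sketch below is a stylized illustration, not the paper's actual posterior computation: it assumes an oracle that knows β, so as to isolate how dividing the residual sum of squares by n + p (reflecting the p prior pseudo-observations) rather than by n biases the variance estimate downward when p is comparable to n.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, sigma2 = 100, 100, 1.0        # p on the same order as n; true variance 1
s = 5                               # only 5 truly nonzero coefficients (sparse beta)

X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:s] = 2.0
y = X @ beta + np.sqrt(sigma2) * rng.standard_normal(n)

rss = np.sum((y - X @ beta) ** 2)   # oracle residuals: E[rss] = n * sigma2

sigma2_indep = rss / n              # divide by n: roughly unbiased for sigma2
sigma2_conj = rss / (n + p)         # p extra pseudo-observations in the divisor

print(round(sigma2_indep, 2))       # near 1.0
print(round(sigma2_conj, 2))        # near 0.5: error variance underestimated
```

With p = n, the conjugate-style divisor halves the variance estimate even though only s = 5 coefficients are truly active, mirroring the distortion described above; treating β and σ² as independent a priori avoids this inflation of the divisor.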

Invariance Criteria
Jeffreys Priors
Prior Considerations
The Failure of a Conjugate Prior
What About a Prior Degrees of Freedom Adjustment?
Connections with Penalized Likelihood Methods
Global-Local Shrinkage
Spike-and-Slab Lasso
Spike-and-Slab Lasso with Unknown Variance
Implementation
Scaled Spike-and-Slab Lasso
Simulation Study
Protein Activity Data
Variable Selection
Predictive Performance
Findings
Conclusion