Abstract

The availability of datasets with large numbers of variables is rapidly increasing. The effective application of Bayesian variable selection methods for regression with these datasets has proved difficult since available Markov chain Monte Carlo methods do not perform well in typical problem sizes of interest. We propose new adaptive Markov chain Monte Carlo algorithms to address this shortcoming. The adaptive design of these algorithms exploits the observation that in large-$p$, small-$n$ settings, the majority of the $p$ variables will be approximately uncorrelated a posteriori. The algorithms adaptively build suitable nonlocal proposals that result in moves with squared jumping distance significantly larger than standard methods. Their performance is studied empirically in high-dimensional problems and speed-ups of up to four orders of magnitude are observed.
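The performance metric referenced in the abstract, squared jumping distance, is conventionally measured in expectation. For a binary inclusion vector $\gamma \in \{0,1\}^p$, the expected squared jumping distance of a chain $(\gamma^{(t)})$ is standardly defined via the Hamming distance $d$ (this is the usual definition from the optimal-scaling literature, stated here for context):

```latex
\mathrm{ESJD} \;=\; \mathbb{E}\!\left[\, d\big(\gamma^{(t+1)}, \gamma^{(t)}\big)^{2} \,\right],
\qquad
d(\gamma, \gamma') \;=\; \sum_{j=1}^{p} \mathbb{1}\{\gamma_j \neq \gamma'_j\}.
```

Larger ESJD indicates that the sampler moves further through model space per iteration, which is why nonlocal proposals that flip many coordinates at once can dominate single-flip schemes.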

Highlights

  • The availability of large data sets has led to an increasing interest in variable selection methods applied to regression models with many potential variables but few observations, so-called large $p$, small $n$ problems

  • Markov chain Monte Carlo methods are typically used to sample from the posterior distribution (George and McCulloch, 1997; O'Hara and Sillanpää, 2009; Clyde et al., 2011)

  • The exploratory individual adaptation algorithm is described in Algorithm 1 and we denote its transition kernel at time $i$ by $P^{\mathrm{EIA}}_{\eta_i}$


Some key words: variable selection; spike-and-slab priors; high-dimensional data; large $p$, small $n$ problems; linear regression; expected squared jumping distance; optimal scaling

Contents

  • INTRODUCTION
  • DESIGN OF THE ADAPTIVE SAMPLERS
  • ERGODICITY OF THE ALGORITHMS
  • RESULTS
