Abstract

Gibbs sampling is a widely used Markov chain Monte Carlo (MCMC) method for numerically approximating integrals of interest in Bayesian statistics and other mathematical sciences. Many implementations of MCMC methods do not extend easily to parallel computing environments, as their inherently sequential nature incurs a large synchronization cost. In the case study illustrated by this paper, we show how to do Gibbs sampling in a fully data-parallel manner on a graphics processing unit, for a large class of exchangeable models that admit latent variable representations. Our approach takes a systems perspective, with emphasis placed on efficient use of compute hardware. We demonstrate our method on a Horseshoe Probit regression model and find that our implementation scales effectively to thousands of predictors and millions of data points simultaneously.

Highlights

  • The Bayesian statistical paradigm has a variety of desirable properties

  • We present a case study of a way to implement Markov chain Monte Carlo (MCMC) for a large class of Bayesian models that admit exchangeable likelihoods with latent variable representations

  • Preliminary descriptive analysis narrowed the available independent variables down to a set of p = 141 interesting predictors (where interesting was determined according to signal-to-noise ratios in maximum-likelihood estimation), and we used our algorithm to fit a Horseshoe Probit regression model to the resulting data set, to see how many of these predictors survived the regularization imposed by the Horseshoe prior

Summary

Introduction

The Bayesian statistical paradigm has a variety of desirable properties. It accounts for the uncertainty inherent in statistical inference by producing a posterior distribution, which fundamentally contains more information about the unknown quantities of interest than a point estimate. In the sections that follow, we describe GPUs, characterize models in which this approach is usable, and demonstrate the method on a Horseshoe Probit model with N = 1,000,000 and p = 1000. Standard computation with such N may take O(days); the method we describe runs in O(minutes).

Statistics and Computing (2019) 29:301–310
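To make the data-parallel idea concrete, here is a minimal sketch of one latent-variable update for a Probit model. It is not the paper's implementation. It assumes the standard data-augmentation form in which each observation i has a latent z_i drawn from N(x_i'β, 1), truncated to be positive when y_i = 1 and non-positive when y_i = 0. The function names `sample_latent` and `update_latents` are hypothetical, and plain Python stands in for a GPU kernel: the point is only that, given β, each z_i depends on its own row of data and nothing else, so the loop body could map to one thread per observation.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for its cdf and inverse cdf


def sample_latent(mean, y, rng):
    """Draw z from N(mean, 1) truncated to z > 0 if y == 1, else to z <= 0.

    Uses inverse-CDF sampling, which is simple but numerically poor when
    the truncation region lies far in a tail (an assumption of this sketch
    is that |mean| stays moderate).
    """
    p_below = STD_NORMAL.cdf(-mean)  # mass of N(mean, 1) below zero
    u = rng.random()
    if y == 1:
        q = p_below + u * (1.0 - p_below)  # quantile in the upper region
    else:
        q = u * p_below  # quantile in the lower region
    # inv_cdf is undefined at exactly 0 and 1, so clamp slightly inside.
    q = min(max(q, 1e-12), 1.0 - 1e-12)
    return mean + STD_NORMAL.inv_cdf(q)


def update_latents(X, y, beta, rng):
    """One Gibbs step for all latents z given beta.

    Each z_i is conditionally independent of the others given beta, so
    every iteration of this comprehension is independent work.
    """
    means = [sum(x_ij * b_j for x_ij, b_j in zip(row, beta)) for row in X]
    return [sample_latent(m, y_i, rng) for m, y_i in zip(means, y)]


# Tiny illustrative data set (made up for this sketch).
rng = random.Random(0)
X = [[1.0, 0.5], [1.0, -1.2], [1.0, 0.3]]
y = [1, 0, 1]
beta = [0.2, -0.4]
z = update_latents(X, y, beta, rng)
```

In a full sampler this step would alternate with draws of β and of the prior's scale parameters given z. Only the per-observation step is shown here because it is the part whose cost grows with N.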

Previous work
Review and description of GPUs
Exchangeable models
Case study
Synthetic data
Real data
Findings
Discussion
