Population-Based MCMC on Multi-Core CPUs, GPUs and FPGAs

Grigorios Mingas, Christos-Savvas Bouganis

doi:10.1109/tc.2015.2439256

Open Access

Journal: IEEE Transactions on Computers
Publication Date: Apr 1, 2016
Citations: 10
License type: cc-by-nc-nd

Affiliation: Imperial College London

Abstract

Markov Chain Monte Carlo (MCMC) is a method for drawing samples from a given probability distribution. Because it is frequently used to solve probabilistic inference problems in which large-scale data are processed repeatedly, MCMC runtimes can be unacceptably long. This paper focuses on population-based MCMC, a popular family of computationally intensive MCMC samplers. We propose novel, highly optimized accelerators on three parallel hardware platforms (multi-core CPUs, GPUs and FPGAs) to address the performance limitations of sequential software implementations. For each platform, we jointly exploit the nature of the underlying hardware and the special characteristics of population-based MCMC. We focus particularly on custom arithmetic precision, introducing two novel methods that employ custom precision in most of the algorithm to reduce runtime without causing sampling errors. We apply these methods to all three platforms. The FPGA accelerators are up to 114x faster than multi-core CPUs and up to 53x faster than GPUs when performing inference on mixture models.
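
For readers unfamiliar with the population-based family mentioned in the abstract, the following is a minimal sequential sketch of one common member, parallel tempering: several chains sample tempered versions of the same target, and neighbouring chains occasionally exchange states. It is not the accelerator design described in the paper. The two-component mixture target, the temperature ladder, the step size, and the names log_target and population_mcmc are all assumptions chosen for illustration.

```python
import math
import random


def log_target(x):
    """Unnormalised log-density of an assumed toy target:
    an equal-weight mixture of N(-3, 0.5^2) and N(3, 0.5^2)."""
    a = -0.5 * ((x + 3.0) / 0.5) ** 2
    b = -0.5 * ((x - 3.0) / 0.5) ** 2
    m = max(a, b)
    # log-sum-exp for numerical stability
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def population_mcmc(n_iters, temps, step=1.0, seed=0):
    """Parallel tempering sketch. temps[0] should be 1.0 (the chain of interest);
    chain i targets the density p(x)^(1/temps[i])."""
    if len(temps) < 2:
        raise ValueError("need at least two chains for exchange moves")
    rng = random.Random(seed)
    states = [0.0] * len(temps)
    samples = []
    for _ in range(n_iters):
        # Local move: an independent random-walk Metropolis update per chain.
        # The chains do not interact here, which is what makes this step parallelisable.
        for i, t in enumerate(temps):
            proposal = states[i] + rng.gauss(0.0, step)
            log_accept = (log_target(proposal) - log_target(states[i])) / t
            # 1 - random() lies in (0, 1], so the log is always defined.
            if math.log(1.0 - rng.random()) < log_accept:
                states[i] = proposal
        # Exchange move: propose swapping the states of a random adjacent pair.
        i = rng.randrange(len(temps) - 1)
        log_swap = (1.0 / temps[i] - 1.0 / temps[i + 1]) * (
            log_target(states[i + 1]) - log_target(states[i])
        )
        if math.log(1.0 - rng.random()) < log_swap:
            states[i], states[i + 1] = states[i + 1], states[i]
        # Only the chain at temperature 1 samples the original target.
        samples.append(states[0])
    return samples


samples = population_mcmc(2000, temps=[1.0, 2.0, 4.0, 8.0])
```

The hotter chains see a flattened version of the target and cross between modes more easily; exchange moves let those crossings reach the chain at temperature 1.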

Full Text