GPU Accelerated Parallel Processing for Large-Scale Monte Carlo Analysis: COVID-19 Parameter Estimation and New Case Forecasting

Brad Suchoski, Heidi Gurung, Steve Stage, Prasith Baccam

doi:10.3389/fams.2022.818016

Abstract

Markov Chain Monte Carlo (MCMC) methods have emerged as one of the premier approaches to estimating posterior distributions for use in Bayesian computations. Unfortunately, these methods often suffer from slow run times when the data become large or when the parameter values come from complex distributions. This speed issue has prevented MCMC analysis from being used to solve some of the most interesting problems for which it is a good fit. We used the Multiple-Try Metropolis variant of the basic Metropolis-Hastings algorithm, which runs more likelihood calculations in parallel at each step in exchange for a higher acceptance rate and faster convergence than traditional MCMC. We optimized our algorithm to parallelize it and to take advantage of GPU processing. We applied our approach to parameter estimation for a Susceptible-Exposed-Infectious-Removed (SEIR) model and to forecasting new cases of COVID-19. In comparison to a fully parallelized CPU implementation, using a single GPU to execute the simulations resulted in more than a 13x speedup in wall clock time, running on multiple GPUs resulted in a 36.3x speedup, and using a cloud-based server consisting of 8 GPUs resulted in a 56.5x speedup. Our approach shows that MCMC methods can be used to tackle problems that were previously thought to be too computationally intensive and slow.

Highlights

This paper explores the use of algorithm optimization and parallelization to accelerate large-scale Markov Chain Monte Carlo (MCMC) analyses, which were applied to parameter estimation for a Susceptible-Exposed-Infectious-Removed (SEIR) model and to forecasting new cases of COVID-19
The key accomplishment of this project was the application of optimization and parallelization techniques to speed up the MCMC analysis to the point that it could be used in a large, real-world situation, demonstrating that theoretical
MCMC methods have emerged as one of the premier approaches to estimating posterior distributions for use in Bayesian computations, which have been useful in a number of fields, including machine learning [1], physics [2], and systems biology [3]

Summary

Introduction

This paper explores the use of algorithm optimization and parallelization to accelerate large-scale Markov Chain Monte Carlo (MCMC) analyses, which were applied to parameter estimation for a Susceptible-Exposed-Infectious-Removed (SEIR) model and to forecasting new cases of COVID-19. We ran a large number of simulations to estimate the parameter values in the SEIR model, and we improved the accuracy of those estimates through optimization: graphics processing unit (GPU) processing, and parallelization of both the likelihood function and multiple MCMC chains, using a multiple-try Metropolis (MTM) MCMC algorithm. MCMC methods have emerged as one of the premier approaches to estimating posterior distributions for use in Bayesian computations, which have been useful in a number of fields, including machine learning [1], physics [2], and systems biology [3]. These methods often suffer from slow run times when the data become large or when the parameter values come from complex distributions. While parallelizing computations will reduce computational time on its own, we sought to increase speed further by optimizing the code to run on GPUs and so take advantage of advances in hardware.
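To make the multiple-try Metropolis idea concrete, the following is a minimal sketch of one common MTM variant (symmetric Gaussian random-walk proposal, with each trial weighted by the target density). It is not the authors' implementation: the toy standard-normal target, the function names (`log_target`, `mtm_step`, `run_mtm`), and the parameters `k` and `step` are all assumptions made for illustration. The batched calls to `log_pi` mark where the likelihood evaluations for all trials could be computed in parallel, for example on a GPU.

```python
import numpy as np

def log_target(x):
    """Unnormalized log-density of a standard normal (toy stand-in for a posterior)."""
    return -0.5 * np.sum(x ** 2, axis=-1)

def logsumexp(a):
    """Numerically stable log(sum(exp(a)))."""
    m = np.max(a)
    return m + np.log(np.sum(np.exp(a - m)))

def mtm_step(x, log_pi, k, step, rng):
    """One multiple-try Metropolis step with a symmetric Gaussian proposal."""
    # 1. Draw k trial points around the current state and evaluate them as a batch.
    #    This batch is the part that can be evaluated in parallel.
    trials = x + step * rng.standard_normal((k, x.size))
    log_w_trials = log_pi(trials)

    # 2. Select one trial with probability proportional to its weight.
    probs = np.exp(log_w_trials - logsumexp(log_w_trials))
    y = trials[rng.choice(k, p=probs)]

    # 3. Build the reference set: k-1 draws around y, plus the current state x.
    refs = np.vstack([y + step * rng.standard_normal((k - 1, x.size)), x])
    log_w_refs = log_pi(refs)

    # 4. Accept y with probability min(1, sum(trial weights) / sum(reference weights)).
    log_alpha = logsumexp(log_w_trials) - logsumexp(log_w_refs)
    if np.log(rng.uniform()) < log_alpha:
        return y, True
    return x, False

def run_mtm(n_steps, k=8, step=1.5, dim=2, seed=0):
    """Run a single MTM chain on the toy target; return samples and acceptance rate."""
    rng = np.random.default_rng(seed)
    x = np.zeros(dim)
    samples = np.empty((n_steps, dim))
    accepted = 0
    for i in range(n_steps):
        x, ok = mtm_step(x, log_target, k, step, rng)
        samples[i] = x
        accepted += ok
    return samples, accepted / n_steps

if __name__ == "__main__":
    samples, rate = run_mtm(5000)
    print("acceptance rate:", rate)
    print("sample mean:", samples.mean(axis=0))
```

Each step costs roughly 2k target evaluations instead of one, which is only worthwhile when those evaluations can run concurrently; that is the trade-off the text describes.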

Methods

Results

Conclusion