Disease forecasting and surveillance often involve fitting models to a tremendous volume of historical testing data collected over space and time. Bayesian spatio-temporal regression models fit with Markov chain Monte Carlo (MCMC) methods are commonly used for such data. When the spatio-temporal support of the model is large, implementing an MCMC algorithm becomes a significant computational burden. This research proposes a computationally efficient gradient boosting algorithm for fitting a Bayesian spatio-temporal mixed effects binomial regression model. We demonstrate our method on a disease forecasting model and compare it to a computationally optimized MCMC approach. Both methods are used to produce monthly forecasts for Lyme disease, anaplasmosis, ehrlichiosis, and heartworm disease in domestic dogs for the contiguous United States. The data have a spatial support of 3108 counties and a temporal support of 108–138 months with 71–135 million test results. The proposed estimation approach is several orders of magnitude faster than the optimized MCMC algorithm, with a similar mean absolute prediction error.