Abstract

Fitting complex models to epidemiological data is a challenging problem: methodologies can be inaccessible to all but specialists, uncertainty in model fitting can be hard to describe adequately, complex models may take a long time to run, and it can be difficult to fully capture the heterogeneity in the data. We develop an adaptive approximate Bayesian computation (ABC) scheme that fits a variety of epidemiologically relevant data with minimal hyper-parameter tuning by using an adaptive tolerance scheme. We implement a novel kernel density estimation scheme to capture both dispersed and multi-dimensional data, and directly compare this technique to standard Bayesian approaches. We then apply the procedure to a complex individual-based simulation of lymphatic filariasis, a human parasitic disease. The procedure and examples are released alongside this article as an open access library, with examples to help researchers rapidly fit models to data. Our results demonstrate that an adaptive ABC scheme with a general summary and distance metric is capable of performing model fitting for a variety of epidemiological data. The scheme also does not require significant theoretical background to use and can be made accessible to the diverse epidemiological research community.
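
As a rough illustration of what an adaptive-tolerance ABC scheme can look like, the sketch below fits the mean m and dispersion k of a negative binomial sample with a sequential ABC sampler. It is not the implementation in the released library. The sorted-sample distance, the rule that sets each tolerance to a quantile of the previous generation's distances, the Gaussian perturbation kernel, the flat prior bounds, and all function names are assumptions made for this example, and the article's kernel density estimation step is not reproduced.

```python
import numpy as np

def simulate_nb(theta, n, rng):
    """Draw n negative binomial counts with mean m and dispersion k."""
    m, k = theta
    return rng.negative_binomial(k, k / (k + m), size=n)

def distance(a, b):
    """Mean absolute gap between sorted samples (equal sizes assumed)."""
    return float(np.mean(np.abs(np.sort(a) - np.sort(b))))

def adaptive_abc(observed, lower, upper, n_particles=100, n_steps=6,
                 quantile=0.7, max_tries=20000, seed=0):
    """Sequential ABC with a flat prior on (lower, upper) and adaptive tolerance."""
    rng = np.random.default_rng(seed)
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)
    n = len(observed)
    # Generation 0: sample the flat prior and keep every particle.
    theta = rng.uniform(lower, upper, size=(n_particles, len(lower)))
    dist = np.array([distance(simulate_nb(t, n, rng), observed) for t in theta])
    weights = np.full(n_particles, 1.0 / n_particles)
    tolerances = []
    for _ in range(n_steps):
        # Assumed adaptive rule: tolerance = quantile of the last distances.
        eps = np.quantile(dist, quantile)
        mean = np.average(theta, axis=0, weights=weights)
        var = np.average((theta - mean) ** 2, axis=0, weights=weights)
        sd = np.sqrt(2.0 * var) + 1e-9  # width of the perturbation kernel
        new_theta, new_dist, tries = [], [], 0
        while len(new_theta) < n_particles and tries < max_tries:
            tries += 1
            parent = theta[rng.choice(n_particles, p=weights)]
            cand = parent + rng.normal(0.0, sd)
            if np.any(cand <= lower) or np.any(cand >= upper):
                continue  # zero prior density, reject immediately
            d = distance(simulate_nb(cand, n, rng), observed)
            if d <= eps:
                new_theta.append(cand)
                new_dist.append(d)
        if len(new_theta) < n_particles:
            break  # simulation budget exhausted: keep the last full generation
        new_theta = np.array(new_theta)
        # Importance weights: flat prior divided by the kernel mixture density.
        z = (new_theta[:, None, :] - theta[None, :, :]) / sd
        kern = np.exp(-0.5 * np.sum(z ** 2, axis=2))
        new_w = 1.0 / (kern @ weights)
        theta, dist, weights = new_theta, np.array(new_dist), new_w / new_w.sum()
        tolerances.append(eps)
    return theta, weights, tolerances

# Usage on synthetic data (true m = 5, k = 0.5 are arbitrary example values).
data = simulate_nb((5.0, 0.5), 200, np.random.default_rng(42))
particles, w, tols = adaptive_abc(data, lower=(0.01, 0.01), upper=(20.0, 5.0))
print("weighted posterior mean (m, k):", np.average(particles, axis=0, weights=w))
```

Because each tolerance is taken from the distances already accepted, the sequence of tolerances cannot increase, and no tolerance schedule has to be chosen by hand. The `max_tries` cap is a safeguard for this sketch only: once the distances reach the noise floor of the simulator, acceptance rates fall and a long run would otherwise stall.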

Highlights

  • There is a trend towards greater realism using individual-based models within the ecological and epidemiological modelling community (Grimm et al, 2006; Bansal et al, 2007; DeAngelis and Grimm, 2014; Heesterbeek et al, 2015)

  • Markov chain Monte Carlo (MCMC) was directly compared to the adaptive approximate Bayesian computation (ABC) method using samples drawn from a negative binomial distribution with a range of means m and heterogeneities k (Fig. 2)

  • The ABC scheme was run with 100 particles over 25 tolerance steps, while the MCMC scheme was run for 10,000 steps with a burn-in period of 2,000 steps and a fixed step size
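
For orientation, the MCMC side of such a comparison could be set up along the following lines. This is a minimal random-walk Metropolis sketch for (m, k) under a negative binomial likelihood, using the chain length and burn-in quoted above. The flat prior bounds, the step size of 0.3, the starting point, and the function names are assumptions for illustration and are not taken from the article.

```python
import numpy as np
from math import lgamma, log

def nb_loglik(m, k, values, counts):
    """Negative binomial log-likelihood with mean m and dispersion k."""
    p = k / (k + m)
    return sum(c * (lgamma(x + k) - lgamma(k) - lgamma(x + 1)
                    + k * log(p) + x * log(1.0 - p))
               for x, c in zip(values, counts))

def metropolis_nb(data, n_steps=10_000, burn_in=2_000, step=0.3,
                  upper=(20.0, 5.0), seed=0):
    """Random-walk Metropolis on (m, k) with a flat prior on (0, upper)."""
    rng = np.random.default_rng(seed)
    # Group identical counts so each likelihood evaluation is cheap.
    values, counts = np.unique(data, return_counts=True)
    values, counts = values.tolist(), counts.tolist()
    # Assumed starting point: sample mean for m, k = 1.
    theta = np.array([max(float(np.mean(data)), 0.1), 1.0])
    ll = nb_loglik(theta[0], theta[1], values, counts)
    chain = np.empty((n_steps, 2))
    for i in range(n_steps):
        cand = theta + rng.normal(0.0, step, size=2)  # fixed step size
        if np.all(cand > 0) and np.all(cand < upper):
            cand_ll = nb_loglik(cand[0], cand[1], values, counts)
            # Symmetric proposal and flat prior: accept on the likelihood ratio.
            if np.log(rng.uniform()) < cand_ll - ll:
                theta, ll = cand, cand_ll
        chain[i] = theta
    return chain[burn_in:]

# Usage on synthetic data (true m = 5, k = 0.5 are arbitrary example values).
rng = np.random.default_rng(42)
data = rng.negative_binomial(0.5, 0.5 / (0.5 + 5.0), size=200)
samples = metropolis_nb(data)
print("posterior mean (m, k):", samples.mean(axis=0))
```

A single isotropic step size is the simplest choice but is rarely well matched to both parameters at once, since m and k typically have posteriors of quite different widths. That kind of hand tuning is what an adaptive scheme aims to avoid.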

Introduction

There is a trend towards greater realism using individual-based models within the ecological and epidemiological modelling community (Grimm et al, 2006; Bansal et al, 2007; DeAngelis and Grimm, 2014; Heesterbeek et al, 2015). Data may come in the form of multivariate time-series, such as the number of diagnoses in different disease stages or age categories, or prevalence stratified by age, risk, or disease stage (Hollingsworth et al, 2008; Pullan et al, 2014). These data can be challenging to fit because they can be noisy and may not be modelled well by simple distributions.

