Abstract

ABCpy is a highly modular scientific library for Approximate Bayesian Computation (ABC) written in Python. The main contribution of this paper is to document a software engineering effort that enables domain scientists to easily apply ABC to their research without being ABC experts; using ABCpy they can easily run large parallel simulations without much knowledge about parallelization. Further, ABCpy enables ABC experts to easily develop new inference schemes and evaluate them in a standardized environment and to extend the library with new algorithms. These benefits come mainly from the modularity of ABCpy. We give an overview of the design of ABCpy and provide a performance evaluation concentrating on parallelization. This points us towards the inherent imbalance in some of the ABC algorithms. We develop a dynamic scheduling MPI implementation to mitigate this issue and evaluate the various ABC algorithms according to their adaptability towards high-performance computing.

Highlights

  • Today, computers are used to simulate different aspects of nature

  • In Algorithm 1, we provide a description of the Population Monte Carlo ABC (PMCABC) algorithm, which we will use in the following to illustrate the idea of approximate Bayesian computation (ABC) algorithms and their parallelization

  • We conclude that APMCABC and SABC perform significantly better than PMCABC because they do not suffer from imbalance, and that they are therefore better suited to parallelization with the map-reduce paradigm


Summary

Introduction

Computers are used to simulate different aspects of nature. Natural scientists traditionally hypothesize models underlying natural phenomena. Our goal is to overcome the need for users to have knowledge of parallel programming, as is required for using ABC-sysbio, and to make a software package available for scientists across domains. These objectives were partly addressed by the parallelization of SMCABC using MPI/OpenMPI (Stram, Marjoram, and Chen 2015), and by making SMCABC available to the astronomical community (Jennings and Madigan 2017). In many real-world problems, the analytic form of the posterior distribution is unknown because the likelihood is not analytically available. This is typical for simulator-based models, for which the likelihood function is often intractable or difficult to compute (as, for instance, in the Lorenz model above or in other integrations of stochastic differential equation models). In such cases the inference schemes are adapted following two alternative approaches: (i) measuring the discrepancy between the simulated and observed datasets, and (ii) approximating the likelihood function.
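
Approach (i) can be illustrated with a minimal rejection sampler. The sketch below is not ABCpy code and is not the PMCABC algorithm; it is plain rejection ABC written with NumPy only. The Gaussian simulator, the Uniform(-5, 5) prior, the sample-mean summary statistic, the tolerance `epsilon`, and all function names are assumptions made for illustration.

```python
import numpy as np


def simulate(theta, n, rng):
    """Hypothetical simulator: n draws from Normal(theta, 1)."""
    return rng.normal(loc=theta, scale=1.0, size=n)


def summary(data):
    """Summary statistic (assumed): the sample mean."""
    return float(np.mean(data))


def rejection_abc(observed, n_draws, epsilon, rng):
    """Keep prior draws whose simulated summary is within epsilon of the observed one."""
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        # Draw a candidate parameter from the assumed Uniform(-5, 5) prior.
        theta = rng.uniform(-5.0, 5.0)
        # Simulate a dataset of the same size as the observed one.
        simulated = simulate(theta, observed.size, rng)
        # Discrepancy: absolute difference between the two summaries.
        if abs(summary(simulated) - s_obs) <= epsilon:
            accepted.append(theta)
    return np.array(accepted)


rng = np.random.default_rng(0)
observed = simulate(2.0, 100, rng)  # synthetic "observed" data, true theta = 2
posterior_sample = rejection_abc(observed, n_draws=5000, epsilon=0.1, rng=rng)
```

The iterations of the loop are independent of one another, so a loop of this kind maps naturally onto a parallel map. Sequential schemes such as PMCABC refine the idea by reusing accepted parameters across generations with a shrinking tolerance.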

Measuring discrepancy
Approximate likelihood
Implemented algorithms
Modular API
API design decisions
Parallelism
Performance evaluation
Dynamic allocation for MPI
Parallelism and ABC algorithms
Innovations of ABCpy compared with similar packages
Learning summary statistics
Probabilistic dependency between random variables
Joint perturbation kernels
Nested parallelization
Convergence diagnostic tools
Discussion
Findings
Details on parameter inference in the Lorenz95 model