Abstract
We present a novel Bayesian inference tool that uses a neural network (NN) to parametrize efficient Markov Chain Monte Carlo (MCMC) proposals. The target distribution is first transformed into a diagonal, unit variance Gaussian by a series of non-linear, invertible, and non-volume preserving flows. NNs are extremely expressive, and can transform complex targets to a simple latent representation. Efficient proposals can then be made in this space, and we demonstrate a high degree of mixing on several challenging distributions. Parameter space can naturally be split into a block diagonal speed hierarchy, allowing for fast exploration of subspaces where it is inexpensive to evaluate the likelihood. Using this method, we develop a nested MCMC sampler to perform Bayesian inference and model comparison, finding excellent performance on highly curved and multimodal analytic likelihoods. We also test it on Planck 2015 data, showing accurate parameter constraints, and calculate the evidence for simple one-parameter extensions to the standard cosmological model in ∼20D parameter space. Our method has wide applicability to a range of problems in astronomy and cosmology and is available for download from https://github.com/adammoss/nnest.
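To make the idea concrete, below is a minimal, self-contained sketch of latent-space MCMC in the spirit described above. It is not the nnest implementation: a hand-coded invertible map for a banana-shaped toy target stands in for the trained flow, and all names and settings are illustrative.

```python
# Minimal sketch of latent-space MCMC, with a hand-coded invertible map
# standing in for the trained flow (the paper learns this map with an NN).
import numpy as np

rng = np.random.default_rng(0)

def log_target(x):
    """Banana-shaped (Rosenbrock-like) log density in parameter space."""
    x1, x2 = x
    return -0.5 * (x1**2 + (x2 - x1**2)**2 / 0.25)

def z_to_x(z):
    """Invertible map from latent z to x; stands in for the inverse flow.
    For this hand-coded map the latent target is exactly a unit Gaussian."""
    return np.array([z[0], 0.5 * z[1] + z[0]**2])

LOG_DET_J = np.log(0.5)  # constant log |det Jacobian| of this particular map

def log_target_latent(z):
    """Pullback of the target to latent space (change-of-variables density)."""
    return log_target(z_to_x(z)) + LOG_DET_J

def metropolis(n_steps=5000, step=1.0):
    """Random-walk Metropolis in latent space, where an isotropic proposal
    mixes well because the mapped target is close to N(0, I)."""
    z = np.zeros(2)
    logp = log_target_latent(z)
    samples = []
    for _ in range(n_steps):
        z_new = z + step * rng.standard_normal(2)
        logp_new = log_target_latent(z_new)
        if np.log(rng.uniform()) < logp_new - logp:
            z, logp = z_new, logp_new
        samples.append(z_to_x(z))
    return np.asarray(samples)

chain = metropolis()
print("posterior mean:", chain.mean(axis=0))
```

The key point is the log-Jacobian correction in `log_target_latent`: because the flows are non-volume preserving, the latent density differs from the target density by |det J|, and proposals accepted against the corrected density remain exact draws from the original target.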
Highlights
In the last few years, we have witnessed a revolution in machine learning
Deep learning is suited to the era of data-driven astronomy and cosmology, but so far applications have mainly focused on supervised learning tasks such as classification and regression
A major challenge in nested sampling is drawing new samples from a constrained target distribution, and we show that neural networks (NNs) can lead to improved performance over existing rejection and Markov Chain Monte Carlo (MCMC)-based approaches (a sketch of such a constrained draw follows this list)
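The sketch below illustrates one such likelihood-constrained draw in the flow's latent space. All names (`z_to_x`, `log_like`, `L_min`) are illustrative, and `z_to_x` is taken as the identity so the snippet runs on its own; in practice it would be the trained inverse flow.

```python
# Sketch of a likelihood-constrained MCMC draw for nested sampling in
# latent space; placeholder functions stand in for the trained flow.
import numpy as np

rng = np.random.default_rng(1)
z_to_x = lambda z: z                      # placeholder for the inverse flow
log_like = lambda x: -0.5 * float(x @ x)  # toy Gaussian log-likelihood

def constrained_draw(z_start, L_min, n_steps=50, step=0.3):
    """Metropolis walk w.r.t. the latent standard normal (the mapped prior),
    rejecting any move whose likelihood falls below the threshold L_min."""
    z = z_start.copy()
    for _ in range(n_steps):
        z_new = z + step * rng.standard_normal(z.shape)
        log_ratio = -0.5 * (z_new @ z_new - z @ z)  # N(0, I) density ratio
        if np.log(rng.uniform()) < log_ratio and log_like(z_to_x(z_new)) > L_min:
            z = z_new
    return z

# Usage: replace the worst live point by a new draw above its likelihood.
z_new = constrained_draw(np.zeros(2), L_min=-3.0)
print(z_new, log_like(z_to_x(z_new)))
```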
Summary
In the last few years, we have witnessed a revolution in machine learning. The use of deep neural networks (NNs) has become widespread due to increased computational power, the availability of large data sets, and their ability to solve problems previously deemed intractable (see LeCun, Bengio & Hinton 2015 for an overview). NNs have recently been used to parametrize MCMC proposals; the proposal function can be trained, for example, to minimize the autocorrelation length of the chain. Some of these methods (e.g. generalizations of Hamiltonian Monte Carlo) exploit the gradient of the target, but analytic gradients are often not available for astronomical or cosmological models. We instead use an NN to transform the likelihood to a simpler representation, which requires no gradient information and is very fast to train. This approach is inspired by representation learning, which hypothesizes that deep NNs have the potential to yield representation spaces in which Markov chains mix faster (Bengio, Courville & Vincent 2012a; Bengio et al. 2012b).
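The "simpler representation" is learned with non-volume preserving flows, of which RealNVP-style affine coupling layers are the standard building block. A minimal sketch of one such layer and its maximum-likelihood training objective, assuming PyTorch, is below; the single layer, network width, and placeholder training data are illustrative only (in practice several layers are stacked with alternating halves and trained on existing chain samples or live points).

```python
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """One RealNVP-style affine coupling layer: half the coordinates pass
    through unchanged and condition a scale/shift of the other half."""
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.d = dim // 2
        self.net = nn.Sequential(
            nn.Linear(self.d, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * (dim - self.d)),
        )

    def forward(self, x):
        x1, x2 = x[:, :self.d], x[:, self.d:]
        s, t = self.net(x1).chunk(2, dim=1)
        s = torch.tanh(s)            # bound the scales for stability
        z2 = x2 * torch.exp(s) + t   # non-volume preserving step
        return torch.cat([x1, z2], dim=1), s.sum(dim=1)  # z, log |det J|

    def inverse(self, z):
        z1, z2 = z[:, :self.d], z[:, self.d:]
        s, t = self.net(z1).chunk(2, dim=1)
        s = torch.tanh(s)
        return torch.cat([z1, (z2 - t) * torch.exp(-s)], dim=1)

# Maximum-likelihood training: push samples x through the flow and maximize
# the standard-normal log density of z plus the log-Jacobian term.
flow = AffineCoupling(dim=2)
opt = torch.optim.Adam(flow.parameters(), lr=1e-3)
x = torch.randn(512, 2)  # placeholder; in practice, existing target samples
for _ in range(200):
    z, log_det = flow(x)
    loss = (0.5 * z.pow(2).sum(dim=1) - log_det).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Note that training only needs samples from the target, not gradients of the likelihood itself, which is what makes the approach attractive for models without analytic gradients.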