Abstract
When inferring unknown parameters or comparing different models, data must be compared to underlying theory. Even if a model has no closed-form solution to derive summary statistics, it is often still possible to simulate mock data in order to generate theoretical predictions. For realistic simulations of noisy data, this is identical to drawing realizations of the data from a likelihood distribution. Though the estimated summary statistic from simulated data vectors may be unbiased, the estimator has variance that should be accounted for. We show how to correct the likelihood in the presence of an estimated summary statistic by marginalizing over the true summary statistic in the framework of a Bayesian hierarchical model. For Gaussian likelihoods where the covariance must also be estimated from simulations, we present an alteration to the Sellentin–Heavens corrected likelihood. We show that excluding the proposed correction leads to an incorrect estimate of the Bayesian evidence with Joint Light-Curve Analysis data. The correction is highly relevant for cosmological inference that relies on simulated data for theory (e.g. weak lensing peak statistics and simulated power spectra) and can reduce the number of simulations required.
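As a worked illustration of the marginalization described above, consider the simplest case only. The assumptions here are not taken from the text: a Gaussian likelihood with known covariance \Sigma, an estimate \hat{\mu} formed as the mean of M independent simulations, and a uniform prior on the true summary statistic \mu. Under those assumptions the marginalization reduces to a standard Gaussian convolution. The case where the covariance is also estimated is not covered by this sketch.

```latex
% Assumptions (illustrative, not from the text): known covariance \Sigma,
% M independent simulations, uniform prior on the true \mu.
\hat{\mu} \mid \mu \sim \mathcal{N}\!\left(\mu,\ \tfrac{1}{M}\,\Sigma\right),
\qquad
d \mid \mu \sim \mathcal{N}\!\left(\mu,\ \Sigma\right)

% Marginalizing over the true summary statistic \mu:
P(d \mid \hat{\mu}, \Sigma)
  = \int P(d \mid \mu, \Sigma)\, P(\mu \mid \hat{\mu}, \Sigma)\, \mathrm{d}\mu
  = \mathcal{N}\!\left(d;\ \hat{\mu},\ \tfrac{M+1}{M}\,\Sigma\right)
```

In this special case the estimator variance simply inflates the covariance by a factor (M+1)/M, which tends to 1 as the number of simulations grows.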
Highlights
It is increasingly common, especially in cosmological surveys, to attempt to make inferences from data d using theory summary statistics μ that can be obtained only from simulations. One example, currently popular in cosmology, is weak lensing peak statistics (Dietrich & Hartlap 2010; Kacprzak et al. 2016; Peel et al. 2017; Shan et al. 2018; Martinet et al. 2018).
Peak statistics broadly aim to use the number of density peaks in the cosmological matter distribution to constrain cosmological parameters and models.
Summary
It is increasingly common, especially in cosmological surveys, to attempt to make inferences from data d using theory summary statistics μ that can be obtained only from simulations. In this work we note that, as with the estimated covariance described by Sellentin & Heavens (2016), an unbiased estimated summary statistic μ is itself a random variable, drawn from a sampling distribution with associated variance. If unaccounted for, this variance will lead to inaccurate parameter inference and misleading model comparison results. In general, likelihood-free inference (LFI) methods assume that the likelihood is unknown, and simulations are used to estimate the resulting posterior distribution conditional on data. Each plate (rectangular box) includes the amount of data associated with the variable; for example, each μ_i (run at position i in parameter space) comes from M_i simulations.
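The claim that an unbiased estimated summary statistic is itself a random variable can be checked numerically. The following is a minimal one-dimensional sketch, not the paper's implementation. It assumes a Gaussian likelihood with known standard deviation `sigma`, an estimate `mu_hat` formed as the mean of `n_sims` simulations, and a uniform prior on the true value. All function names and numerical values are hypothetical and chosen for illustration. Under these assumptions the residual `d - mu_hat` should have variance `sigma**2 * (1 + 1/n_sims)` rather than `sigma**2`.

```python
import math
import random
import statistics


def simulate(mu_true, sigma, rng):
    """One noisy mock realization: a draw from the (assumed Gaussian) likelihood."""
    return rng.gauss(mu_true, sigma)


def estimate_mu(mu_true, sigma, n_sims, rng):
    """Unbiased estimate of the summary statistic: the mean of n_sims simulations."""
    return statistics.fmean([simulate(mu_true, sigma, rng) for _ in range(n_sims)])


def log_like(d, mu_hat, sigma, n_sims=None):
    """Gaussian log-likelihood of d given mu_hat (illustrative helper).

    n_sims=None treats mu_hat as exact (no correction).
    Otherwise the variance is inflated by (1 + 1/n_sims), which is the result
    of marginalizing over the true mean under the assumptions stated above.
    """
    var = sigma ** 2
    if n_sims is not None:
        var *= 1.0 + 1.0 / n_sims
    return -0.5 * ((d - mu_hat) ** 2 / var + math.log(2.0 * math.pi * var))


rng = random.Random(0)  # fixed seed so the sketch is reproducible
mu_true, sigma, n_sims, n_trials = 1.0, 2.0, 5, 20000  # hypothetical values

# Repeat the experiment: new data and a new simulation-based estimate each time.
residuals = []
for _ in range(n_trials):
    d = simulate(mu_true, sigma, rng)
    mu_hat = estimate_mu(mu_true, sigma, n_sims, rng)
    residuals.append(d - mu_hat)

empirical_var = statistics.pvariance(residuals)
naive_var = sigma ** 2                           # ignores estimator variance
corrected_var = sigma ** 2 * (1.0 + 1.0 / n_sims)  # accounts for it

print(f"empirical variance of d - mu_hat: {empirical_var:.3f}")
print(f"naive variance:                   {naive_var:.3f}")
print(f"corrected variance:               {corrected_var:.3f}")
```

With few simulations the naive variance is visibly too small, so a likelihood that treats `mu_hat` as exact penalizes large residuals too strongly. Passing `n_sims` to `log_like` applies the inflated variance instead.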