Abstract

When inferring unknown parameters or comparing different models, data must be compared to underlying theory. Even if a model has no closed-form solution to derive summary statistics, it is often still possible to simulate mock data in order to generate theoretical predictions. For realistic simulations of noisy data, this is identical to drawing realizations of the data from a likelihood distribution. Though the estimated summary statistic from simulated data vectors may be unbiased, the estimator has variance that should be accounted for. We show how to correct the likelihood in the presence of an estimated summary statistic by marginalizing over the true summary statistic in the framework of a Bayesian hierarchical model. For Gaussian likelihoods where the covariance must also be estimated from simulations, we present an alteration to the Sellentin–Heavens corrected likelihood. We show that excluding the proposed correction leads to an incorrect estimate of the Bayesian evidence with Joint Light-Curve Analysis data. The correction is highly relevant for cosmological inference that relies on simulated data for theory (e.g. weak lensing peak statistics and simulated power spectra) and can reduce the number of simulations required.

Highlights

  • It is increasingly common, especially in cosmological surveys, to attempt to make inferences from data d using theory summary statistics μ that can be obtained only from simulations. One example, currently popular in cosmology, is weak lensing peak statistics (Dietrich & Hartlap 2010; Kacprzak et al. 2016; Peel et al. 2017; Shan et al. 2018; Martinet et al. 2018).

  • Peak statistics broadly aim to use the number of density peaks in the cosmological matter distribution to constrain cosmological parameters and models.

Summary

INTRODUCTION

It is increasingly common, especially in cosmological surveys, to attempt to make inferences from data d using theory summary statistics μ that can be obtained only from simulations. In this work we note that, as with the estimated covariance described by Sellentin & Heavens (2016), an unbiased estimate of the summary statistic μ is itself a random variable, drawn from a sampling distribution with an associated variance. If unaccounted for, this variance will lead to inaccurate parameter inference and misleading model comparison results. In general, likelihood-free inference (LFI) methods assume that the likelihood is unknown, and simulations are used to estimate the resulting posterior distribution conditional on the data. In the graphical representation of the hierarchical model, each plate (rectangular box) indicates the amount of data associated with a variable; for example, each μ_i (run at position i in parameter space) comes from M_i simulations.
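The effect of a noisy estimated summary statistic can be illustrated with a minimal sketch. It assumes the simplest case: a Gaussian likelihood with known covariance Σ, a summary statistic estimated as the mean of M independent simulations, and a flat prior on the true μ. Under those assumptions, marginalizing over the true μ is a convolution of two Gaussians, which inflates the covariance by a factor (1 + 1/M). The function names, the toy covariance, and the numbers below are illustrative choices and are not taken from the paper, whose own treatment may differ in detail.

```python
import numpy as np

def naive_loglike(d, mu_hat, cov):
    """Gaussian log-likelihood that treats the estimated mean as exact."""
    r = d - mu_hat
    _, logdet = np.linalg.slogdet(2.0 * np.pi * cov)
    return -0.5 * (r @ np.linalg.solve(cov, r) + logdet)

def corrected_loglike(d, mu_hat, cov, n_sims):
    """Log-likelihood after marginalizing over the true mean (flat prior).

    Assumes mu_hat is the average of n_sims simulations, each drawn with
    covariance cov, so the effective covariance is (1 + 1/n_sims) * cov.
    """
    return naive_loglike(d, mu_hat, (1.0 + 1.0 / n_sims) * cov)

# Toy check (illustrative values): the scatter of d - mu_hat should match
# the inflated covariance rather than cov itself.
rng = np.random.default_rng(0)
cov = np.array([[1.0, 0.3], [0.3, 0.5]])
mu_true = np.array([0.2, -1.0])
n_sims, n_trials = 5, 20000

# One observed data vector per trial.
d = rng.multivariate_normal(mu_true, cov, size=n_trials)
# n_sims simulated data vectors per trial, averaged to estimate the mean.
sims = rng.multivariate_normal(mu_true, cov, size=(n_trials, n_sims))
mu_hat = sims.mean(axis=1)

empirical_cov = np.cov(d - mu_hat, rowvar=False)
expected_cov = (1.0 + 1.0 / n_sims) * cov
print(np.round(empirical_cov, 2))
print(np.round(expected_cov, 2))
```

With few simulations per parameter point the inflation is substantial (20 per cent for M = 5 in this toy setup), and it vanishes as M grows, where the corrected and naive likelihoods coincide.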

Posterior and likelihood
Likelihood correction
Bayesian Hierarchical Model
GAUSSIAN NAIVE LIKELIHOOD
Known Covariance
Unknown Covariance
TOY MODEL DEMONSTRATION
JLA SUPERNOVAE DEMONSTRATION
Data and Model
Likelihood and Priors
Results
Model Comparison
DISCUSSION & CONCLUSIONS