Abstract

Generalized Bayes posterior distributions are formed by raising the likelihood to a fractional power before combining it with the prior via Bayes’s formula. This fractional power, often viewed as a remedy for potential model misspecification bias, is called the learning rate, and a number of data-driven learning rate selection methods have been proposed in the recent literature. Each of these proposals has a different focus, a different target it aims to achieve, which makes them difficult to compare. In this paper, we provide a direct, head-to-head empirical comparison of these learning rate selection methods in various misspecified model scenarios, in terms of several relevant metrics, in particular the coverage probability of the generalized Bayes credible regions. In some examples all the methods perform well, while in others the misspecification is too severe to be overcome, but we find that the so-called generalized posterior calibration algorithm tends to outperform the others in terms of credible region coverage probability.
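In symbols, with L_n(θ) denoting the likelihood, π(θ) the prior, and η the learning rate (notation assumed here, not taken from the paper), the construction described above is

    π_n^(η)(θ) ∝ L_n(θ)^η π(θ),

where the fractional power η typically lies in (0, 1] and η = 1 recovers the ordinary Bayes posterior.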

Highlights

  • Specification of a sound model is a critical part of an effective statistical analysis. This is especially true for a Bayesian approach, since the statistical model, or likelihood, is explicitly used to construct the posterior distribution from which inferences will be drawn.

  • We present the average value of the selected learning rate, the coverage probability of 95% credible intervals for θ, the average length of those credible intervals, and the mean squared error, all based on 500 replications, for each pair of μ and sample size n (a sketch of how such summaries can be computed follows this list).

  • In the Degree 1 case, where the misspecification is relatively mild, the methods perform reasonably well in terms of coverage probability, but coverage deteriorates as the sample size increases, a symptom of model misspecification bias.
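The simulation summaries mentioned in the highlights can be illustrated with a small, self-contained sketch. The setup below, a normal-mean working model fit to heavier-tailed data with a fixed learning rate eta, is an illustrative choice of ours, not the paper's simulation design, and it does not implement any of the data-driven learning rate selection methods being compared.

    # Illustrative sketch: Monte Carlo coverage, length, and MSE summaries
    # for a generalized-Bayes credible interval with a fixed learning rate.
    import numpy as np
    from scipy import stats

    def gen_bayes_interval(y, eta, sigma=1.0, m0=0.0, v0=100.0, level=0.95):
        # Normal working model N(theta, sigma^2) with N(m0, v0) prior; raising
        # the likelihood to the power eta rescales its precision, so the
        # generalized posterior for theta is still normal.
        n = len(y)
        prec = 1.0 / v0 + eta * n / sigma**2
        mean = (m0 / v0 + eta * y.sum() / sigma**2) / prec
        sd = (1.0 / prec) ** 0.5
        z = stats.norm.ppf(0.5 + level / 2)
        return mean, (mean - z * sd, mean + z * sd)

    def simulate(theta_true=0.0, n=100, eta=1.0, reps=500, seed=1):
        rng = np.random.default_rng(seed)
        cover, length, sq_err = [], [], []
        for _ in range(reps):
            # t_3 data: heavier tails than the working normal model,
            # a simple stand-in for model misspecification
            y = theta_true + rng.standard_t(df=3, size=n)
            est, (lo, hi) = gen_bayes_interval(y, eta)
            cover.append(lo <= theta_true <= hi)
            length.append(hi - lo)
            sq_err.append((est - theta_true) ** 2)
        return np.mean(cover), np.mean(length), np.mean(sq_err)

    print(simulate(eta=1.0))   # coverage, average length, MSE at eta = 1
    print(simulate(eta=0.5))   # same summaries at a smaller learning rate

Comparing the two calls shows how the learning rate controls the spread of the generalized posterior and hence the coverage and length of its credible intervals.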


Summary

Introduction

Specification of a sound model is a critical part of an effective statistical analysis. This is especially true for a Bayesian approach, since the statistical model, or likelihood, is explicitly used to construct the posterior distribution, denoted here by Π_n, from which inferences will be drawn. When the model is correctly specified, Π_n concentrates its mass around the true parameter value θ as n → ∞, and the Bernstein–von Mises theorem (e.g., van der Vaart, 2000, Chapter 10) states that Π_n is approximately a normal distribution, centered at the maximum likelihood estimator θ̂_n, with covariance matrix proportional to the inverse of the Fisher information matrix at θ. This implies, among other things, that credible regions derived from Π_n closely resemble the asymptotic confidence regions based on likelihood theory, so that, asymptotically, the Bayesian posterior credible regions have frequentist coverage probability close to the advertised level. When the model is misspecified, the posterior will, under suitable conditions, still concentrate around a “best” parameter value, the one that minimizes the Kullback–Leibler divergence of the posited model from the true data-generating distribution. But even in those relatively nice cases, where this best value could be meaningful, or even equal to the real quantity of interest, the spread of the posterior need not be appropriate, so the resulting credible regions can fail to achieve the advertised coverage.
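In symbols, with I(·) denoting the Fisher information matrix (notation assumed here), the Bernstein–von Mises approximation described above reads

    Π_n ≈ N( θ̂_n , n⁻¹ I(θ̂_n)⁻¹ ),

so, for a scalar θ, the 95% credible interval is approximately θ̂_n ± 1.96 {n I(θ̂_n)}^(−1/2), the same form as the standard Wald confidence interval; this is the sense in which, under a correctly specified model, the credible regions inherit the advertised frequentist coverage.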
