The sceptical Bayes factor for the assessment of replication success

Samuel Pawel, Leonhard Held

doi:10.5167/uzh-196009

Abstract

There is an urgent need to develop new methodology for the design and analysis of replication studies. Recently, a reverse-Bayes method called the sceptical $p$-value has been proposed for this purpose; the inversion of Bayes' theorem allows us to mathematically formalise the notion of scepticism, which in turn can be used to assess the agreement between the findings of an original study and its replication. However, despite its Bayesian nature, the method relies on tail probabilities as primary inference tools. Here, we present an extension that uses Bayes factors as an alternative means of quantifying evidence. This leads to a new measure for evaluating replication success, the sceptical Bayes factor. Conceptually, the sceptical Bayes factor provides a bound for the maximum level of evidence at which an advocate of the original finding can convince a sceptic who does not trust it, in light of the replication data. While the sceptical $p$-value can only quantify the conflict between the sceptical prior and the observed replication data, the sceptical Bayes factor also takes into account how likely the data are under the posterior distribution of the effect conditional on the original study, allowing for stronger statements about replication success. Moreover, the proposed method elegantly combines traditional notions of replication success; it ensures that both studies need to show evidence against the null, while at the same time penalising incompatibility of their effect estimates. Case studies from the Reproducibility Project: Cancer Biology and the Social Sciences Replication Project show the advantages of the method for the quantitative assessment of replicability.

Full Text