Abstract

Several systematic studies have suggested that a large fraction of published research is not reproducible. One probable reason for low reproducibility is insufficient sample size, which results in low power and low positive predictive value. It has been suggested that the choice of insufficient sample sizes is driven by a combination of scientific competition and 'positive publication bias'. Here we formalize this intuition in a simple model in which scientists choose economically rational sample sizes, balancing the cost of experimentation against income from publication. Specifically, assuming that a scientist's income derives only from 'positive' findings (positive publication bias) and that individual samples cost a fixed amount allows us to turn basic statistical formulas into an economic optimality prediction. We find that the rational (economically optimal) sample size is small whenever effects have (i) a low base probability or (ii) a small effect size, or when (iii) grant income per publication is low. Furthermore, for plausible distributions of these parameters we find a robust emergence of a bimodal distribution of obtained statistical power and of low overall reproducibility rates, both matching empirical findings. Finally, we explore conditional equivalence testing as a means of aligning economic incentives with adequate sample sizes. Overall, the model describes a simple mechanism that explains both the prevalence and the persistence of small sample sizes, and it is well suited to empirical validation. It proposes economic rationality, or economic pressures, as a principal driver of irreproducibility and suggests strategies to change this.
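
To make the abstract's mechanism concrete, here is a minimal sketch (not the authors' actual code) of how an economically rational sample size can be computed: expected income from 'positive' findings, minus a fixed cost per sample, is maximized over the per-group sample size n. All parameter names and values (b, d, income, cost_per_sample) are illustrative assumptions, and power is computed with a normal approximation to the two-sample t-test.

```python
# A minimal sketch (not the paper's code) of the economic optimality idea:
# a scientist picks the sample size n that maximizes expected income from
# 'positive' findings minus the cost of samples. All parameters are assumed.
import numpy as np
from scipy.stats import norm

def power_two_sample(n, d, alpha=0.05):
    """Approximate power of a two-sided two-sample t-test
    (normal approximation) with n subjects per group and effect size d."""
    z_crit = norm.ppf(1 - alpha / 2)
    return 1 - norm.cdf(z_crit - d * np.sqrt(n / 2))

def expected_payoff(n, b, d, income, cost_per_sample, alpha=0.05):
    """Expected income per experiment: only positives are published and paid
    (positive publication bias); every sample costs a fixed amount."""
    p_positive = b * power_two_sample(n, d, alpha) + alpha * (1 - b)
    return income * p_positive - cost_per_sample * 2 * n  # two groups

def rational_n(b, d, income, cost_per_sample, n_max=1000):
    """Economically optimal per-group sample size by exhaustive search."""
    ns = np.arange(2, n_max)
    payoffs = [expected_payoff(n, b, d, income, cost_per_sample) for n in ns]
    return ns[int(np.argmax(payoffs))]

# Rare, small effects make the rational sample size collapse to the minimum;
# common, large effects support a well-powered study.
print(rational_n(b=0.1, d=0.3, income=1000.0, cost_per_sample=1.0))
print(rational_n(b=0.5, d=0.8, income=1000.0, cost_per_sample=1.0))
```

The exhaustive search is deliberate: the payoff curve can be flat or multi-peaked for some parameter combinations, so a grid over n is more robust here than a gradient-based optimizer.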

Highlights

  • Systematic attempts at replicating published research have produced disquietingly low reproducibility rates, often below 50% [1,2,3,4,5]

  • With infinite sample size, and hence power approaching one, the total publishable rate approaches the rate of true effects (b) plus a fraction of false positives (α × (1 − b)); see the numeric check after this list

  • We find that conditional equivalence testing (CET) with Δ = 0.5d leads to improved power and positive predictive value (PPV) for most parameter distributions, but not for the more realistic ones (Fig 4E: low and low/bimodal distributions of b)
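
As a quick numeric check of the limiting publishable rate quoted in the second highlight (the values of b and α are assumed for illustration, not taken from the paper):

```python
# Limiting publishable rate from the second highlight: as power -> 1,
# P(positive) -> b + alpha * (1 - b). Parameter values are assumptions.
b, alpha = 0.10, 0.05
limit_rate = b + alpha * (1 - b)
print(limit_rate)  # 0.145: 10% true effects plus 4.5% false positives
```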

Introduction

Systematic attempts at replicating published research have produced disquietingly low reproducibility rates, often below 50% [1,2,3,4,5]. A recent survey suggests that a vast majority of scientists believe we are currently in a ‘reproducibility crisis’ [6]. While the term ‘crisis’ is contested [7], the available evidence on reproducibility certainly raises questions. One likely reason for low reproducibility rates is insufficient sample size and the resulting low statistical power and positive predictive value [8,9,10,11,12]. In the most prevalent statistical framework in science, null-hypothesis significance testing (NHST), the statistical power of a study is the probability of detecting a hypothesized effect with a given sample size.
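
The power concept can be illustrated with a standard power calculation. This is a generic sketch using statsmodels, with an assumed effect size and sample size rather than values from the paper:

```python
# Statistical power under NHST: the probability of detecting a hypothesized
# effect with a given sample size. Effect size and n are assumed examples.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Power of a two-sided two-sample t-test: effect size d = 0.5,
# 20 subjects per group, alpha = 0.05.
power = analysis.power(effect_size=0.5, nobs1=20, ratio=1.0, alpha=0.05)
print(f"power = {power:.2f}")  # ~0.34, well below the conventional 0.8

# Per-group sample size needed to reach 80% power for the same effect.
n_needed = analysis.solve_power(effect_size=0.5, power=0.8, alpha=0.05)
print(f"n per group for 80% power = {n_needed:.0f}")  # ~64
```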
