Abstract

Replicability of results has been a gold standard in science and should remain so, but concerns about its absence have increased in recent years. Transparency, good design, and reproducible computing and data analysis are prerequisites for replicability. Adopting appropriate statistical methodologies is another identified prerequisite, yet which methodologies can be used to enhance the replicability of results from a single study remains controversial. Whereas the p-value and statistical significance carry most of the blame, this article argues that addressing selective inference is a missing statistical cornerstone of enhancing replicability. I review the ways in which selective inference manifests itself and the available methods for addressing it. I also discuss and demonstrate whether and how selective inference is addressed in many fields of science, including the attitudes of leading scientific publications as expressed in their recent editorials. Most notably, selective inference is attended to when the number of potential findings from which the selection takes place is in the thousands, but it is ignored when ‘only’ dozens or hundreds of potential discoveries are involved. As replicability, and the closely related concept of generalizability, can only be assessed by actual replication attempts, the question of how to make replication an integral part of regular scientific work becomes crucial. I outline a way to ensure that some replication effort will be an inherent part of every study. This approach requires the efforts and cooperation of all parties involved: scientists, publishers, granting agencies, and academic leaders.

Highlights

  • Replicability of results has been a gold standard in science and should remain so, but concerns about its absence have increased in recent years

  • Whereas the p-value and statistical significance carry most of the blame, this article argues that addressing selective inference is a missing statistical cornerstone of enhancing replicability

  • Taking a more extreme stand, the Journal of Basic and Applied Social Psychology banned the use of p-values and discouraged the use of any statistical methods (Trafimow & Marks, 2015), taking us back to the 19th century, when the results of studies were reported merely by tables and figures, with no quantitative assessment of the uncertainties involved. (For the unfortunate implications of the ban on the results reported in that journal a year later, see Fricker et al., 2019.)

Summary

The Reproducibility and Replicability Crisis

Experimental science has been based on the paradigm that a result obtained from a one-time experiment is insufficient to establish the validity of a discovery. Offering statistical significance, formulated as a p-value below a threshold, Fisher further stated: “we may say that a phenomenon is experimentally demonstrable when we know how to conduct an experiment which will rarely fail to give us a statistically significant result.” This rule for a replicated discovery has served science well for almost a century, despite the philosophical disputes surrounding it. The retractions of Diederik Stapel’s works because of falsification and fabrication of data (Levelt et al., 2012) served, for many, as the ultimate proof that much of current science is false. Such concerns led to the Psychological Reproducibility Project, in which the main result of each of 100 papers from three leading journals in the field was tested for replication by other researchers. How seriously journals took these concerns is well demonstrated by Science, which set up at that time a statistical editorial board charged with examining submitted papers for potential statistical problems.
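
The role of selection can be made concrete with a small simulation. The following is a minimal sketch (an illustration in Python with NumPy and SciPy, not code from the article): it screens 100 pure-noise effects, shows that thresholding raw p-values at 0.05 still yields a handful of apparent ‘discoveries’, and shows that the Benjamini–Hochberg procedure, one of the simultaneous-inference remedies for selective inference, removes them.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
m, n, alpha = 100, 30, 0.05   # hypotheses screened, sample size per test, significance level

# Every one of the m candidate "effects" is pure noise: all null hypotheses are true.
data = rng.normal(loc=0.0, scale=1.0, size=(m, n))
pvals = stats.ttest_1samp(data, popmean=0.0, axis=1).pvalue

# Naive inference after selection: report each effect with p < alpha.
# With 100 true nulls, roughly 5 such "discoveries" are expected by chance alone.
naive_hits = int(np.sum(pvals < alpha))

# Benjamini-Hochberg step-up procedure: reject the k smallest p-values,
# where k is the largest index with p_(k) <= (k / m) * alpha.
sorted_p = np.sort(pvals)
below = sorted_p <= alpha * np.arange(1, m + 1) / m
bh_hits = int(np.nonzero(below)[0].max() + 1) if below.any() else 0

print(f"raw p < {alpha}: {naive_hits} 'discoveries' out of {m} true nulls")
print(f"after Benjamini-Hochberg FDR control: {bh_hits} discoveries")
```

On a typical run, a few naive ‘discoveries’ appear and none survive the adjustment: the selection effect, not any real signal, produces the findings.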

The Misguided Attack
Selective Inference
Simultaneous Inference
On the Average Over the Selected
Conditional Inference
Sample-Splitting
The Status of Addressing Selective Inference
Large-Scale Research
Medical Research
Experimental Psychology
Open Science Framework
Bayesian Approach
Nature
General
Findings
Replication as a Way of Life in Scientific Work