Abstract
To many, the foundations of statistical inference are cryptic and irrelevant to routine statistical practice. The analysis of 2 × 2 contingency tables, omnipresent in the scientific literature, is a case in point. Fisher's exact test is routinely used even though it has been fraught with controversy for over 70 years. The problem, not widely acknowledged, is that several different p-values can be associated with a single table, making scientific inference inconsistent. The root cause of this controversy lies in the table's origins and the manner in which nuisance parameters are eliminated. However, fundamental statistical principles (e.g., sufficiency, ancillarity, conditionality, and likelihood) can shed light on the controversy and guide our approach to using this test. In this paper, we use these fundamental principles to show how much information is lost when the table's origins are ignored and when various approaches are used to eliminate unknown nuisance parameters. We present novel likelihood contours to aid in the visualization of information loss and show that the information loss is often virtually non-existent. We find that problems arising from the discreteness of the sample space are exacerbated by p-value-based inference. Accordingly, methods that are less sensitive to this discreteness (likelihood ratios, posterior probabilities, and mid-p-values) lead to more consistent inferences.
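To make the multiplicity of p-values concrete, here is a minimal sketch (ours, not the paper's; the table counts are hypothetical) computing several commonly used p-values for the same 2 × 2 table with SciPy. Fisher's exact test conditions on both margins, the chi-square test does not, and the mid-p halves the probability of the observed table:

```python
# Hypothetical 2x2 table; any table exhibits the same multiplicity of p-values.
from scipy.stats import fisher_exact, chi2_contingency, hypergeom

table = [[7, 3], [2, 8]]

# Fisher's exact test: conditions on both margins.
_, p_fisher = fisher_exact(table, alternative="two-sided")

# Pearson chi-square: asymptotic, no continuity correction.
_, p_chi2, _, _ = chi2_contingency(table, correction=False)

# One-sided exact and mid-p from the null hypergeometric distribution
# of the (1,1) cell given both margins.
a = table[0][0]
n1, n2 = sum(table[0]), sum(table[1])    # row totals
m1 = table[0][0] + table[1][0]           # first-column total
rv = hypergeom(n1 + n2, n1, m1)
p_exact = rv.sf(a - 1)                   # P(X >= a)
p_mid = rv.sf(a) + 0.5 * rv.pmf(a)       # mid-p: halve P(X = a)

print(p_fisher, p_chi2, p_exact, p_mid)  # four different p-values, one table
```

All four numbers nominally address the same null hypothesis of no association, yet they differ; this is the inconsistency described above.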
Highlights
To many, the foundations of statistical inference are cryptic and irrelevant to routine statistical practice
We show that it is the discreteness of the sample space that is most problematic, and this discreteness is exacerbated when the statistical evidence is summarized with a p-value derived by model conditioning
The Sufficiency and Conditionality Principles play an important role here as they provide a basis for the specification of a working likelihood and the elimination of nuisance parameters (a conditional-likelihood sketch follows below)
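As a hedged illustration of the conditionality point above (the counts and grid are hypothetical and ours, not the paper's): conditioning on both margins of a 2 × 2 table leaves Fisher's noncentral hypergeometric distribution, whose only parameter is the odds ratio, so the nuisance parameter is eliminated. A minimal sketch, reusing the same hypothetical table as the earlier example:

```python
import numpy as np
from scipy.special import comb

def conditional_likelihood(psi, a, n1, n2, m1):
    # Fisher noncentral hypergeometric probability of the observed (1,1)
    # cell a, given row totals n1, n2 and first-column total m1.
    lo, hi = max(0, m1 - n2), min(n1, m1)        # support of the cell count
    u = np.arange(lo, hi + 1)
    denom = (comb(n1, u) * comb(n2, m1 - u) * psi ** u).sum()
    return comb(n1, a) * comb(n2, m1 - a) * psi ** a / denom

a, n1, n2, m1 = 7, 10, 10, 9                     # hypothetical counts
psi_grid = np.linspace(0.1, 50.0, 500)
lik = np.array([conditional_likelihood(p, a, n1, n2, m1) for p in psi_grid])
lik /= lik.max()                                 # relative likelihood on [0, 1]
print(psi_grid[lik.argmax()])                    # approximate conditional MLE
```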
Summary
To many, the foundations of statistical inference are cryptic and irrelevant to routine statistical practice. The analysis of 2 × 2 contingency tables appears simple, yet it has generated controversy and dispute in the statistical literature for more than half a century, so perhaps 'deceptively simple' would be a better description. Fisher's exact test, perhaps the most widely applied statistical method in the scientific literature, has elicited enormous controversy over the past 70 years. A problem, not widely acknowledged, is that many different p-values, including that from Fisher's exact test, can be associated with a single table despite the fact that they all appear to test the same null hypothesis, making scientific inference inconsistent. As such, this controversy is often viewed, too simplistically, as a problem of selecting the 'right' p-value. We present three main models for 2 × 2 contingency tables that will provide the basis for specifying the working likelihood.
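For contrast with conditioning, the following sketch (ours; whether the paper takes this particular route is not stated here) eliminates the nuisance parameter by profiling it out under the two-independent-binomials model, one of the standard sampling models for a 2 × 2 table. All counts and parameter names are hypothetical:

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import binom

a, n1 = 7, 10   # hypothetical row 1: successes, trials
b, n2 = 2, 10   # hypothetical row 2: successes, trials

def neg_loglik(p2, psi):
    # The odds ratio psi links p1 to the nuisance parameter p2.
    odds1 = psi * p2 / (1.0 - p2)
    p1 = odds1 / (1.0 + odds1)
    return -(binom.logpmf(a, n1, p1) + binom.logpmf(b, n2, p2))

def profile_loglik(psi):
    # Maximize over the nuisance p2 at each fixed psi.
    res = minimize_scalar(neg_loglik, bounds=(1e-6, 1 - 1e-6),
                          args=(psi,), method="bounded")
    return -res.fun

psi_grid = np.linspace(0.5, 60.0, 200)
prof = np.array([profile_loglik(p) for p in psi_grid])
prof -= prof.max()                       # relative profile log-likelihood
print(psi_grid[prof.argmax()])           # approximate unconditional MLE of psi
```

Comparing this profile curve with the conditional likelihood above, on the same hypothetical table, is one way to visualize how little information the conditioning step discards.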