Abstract
Null hypothesis significance testing has been under attack in recent years, partly owing to the arbitrary nature of setting α (the decision-making threshold and probability of Type I error) at a constant value, usually 0.05. If the goal of null hypothesis testing is to present conclusions in which we have the highest possible confidence, then the only logical decision-making threshold is the value that minimizes the probability (or occasionally, cost) of making errors. Setting α to minimize the combination of Type I and Type II error at a critical effect size can easily be accomplished for traditional statistical tests by calculating the α associated with the minimum average of α and β at the critical effect size. This technique also has the flexibility to incorporate prior probabilities of null and alternate hypotheses and/or relative costs of Type I and Type II errors, if known. Using an optimal α results in stronger scientific inferences because it estimates and minimizes both Type I errors and relevant Type II errors for a test. It also results in greater transparency concerning assumptions about relevant effect size(s) and the relative costs of Type I and II errors. By contrast, the use of α = 0.05 results in arbitrary decisions about what effect sizes will likely be considered significant, if real, and results in arbitrary amounts of Type II error for meaningful potential effect sizes. We cannot identify a rationale for continuing to arbitrarily use α = 0.05 for null hypothesis significance tests in any field, when it is possible to determine an optimal α.
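The calculation described above can be sketched for the simplest case. The following is a minimal illustration, not the authors' implementation: it assumes a one-sided, one-sample z-test with a standardized critical effect size, and the function names, the `prior_h0` and `cost_ratio` parameters, and the grid search are all hypothetical choices made for this example. It finds the α that minimizes the weighted average of α and β at the critical effect size.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution, mean 0 and sd 1


def beta_one_sided_z(alpha, effect_size, n):
    """Type II error rate of a one-sided, one-sample z-test.

    Assumption for this sketch: effect_size is the critical effect size
    expressed in standard-deviation units, and n is the sample size.
    """
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha)
    return _STD_NORMAL.cdf(z_crit - effect_size * n ** 0.5)


def optimal_alpha(effect_size, n, prior_h0=0.5, cost_ratio=1.0, grid=10_000):
    """Grid-search the alpha that minimizes the weighted average error.

    prior_h0   : assumed prior probability that the null hypothesis is true
    cost_ratio : assumed cost of a Type I error relative to a Type II error
    Returns (alpha, beta, weighted_average_error).
    """
    w_type1 = prior_h0 * cost_ratio   # weight on alpha
    w_type2 = 1.0 - prior_h0          # weight on beta
    best = None
    for i in range(1, grid):
        alpha = i / grid
        beta = beta_one_sided_z(alpha, effect_size, n)
        omega = (w_type1 * alpha + w_type2 * beta) / (w_type1 + w_type2)
        if best is None or omega < best[2]:
            best = (alpha, beta, omega)
    return best


if __name__ == "__main__":
    alpha, beta, omega = optimal_alpha(effect_size=0.5, n=16)
    print(f"optimal alpha={alpha:.3f}, beta={beta:.3f}, average error={omega:.3f}")
```

With equal priors and equal costs, a standardized critical effect size of 0.5, and n = 16, the search lands near α ≈ 0.16, where α and β are approximately equal. Raising `cost_ratio` (treating Type I errors as more costly) pushes the optimal α lower, which is the flexibility the abstract refers to.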
Highlights
A well-known problem associated with null hypothesis significance tests (NHST) is the arbitrariness of the chosen experimental significance level, alpha (α)
To provide examples of how the optimal α approach can be applied for actual scientific research, we decided to re-analyse the results of two simple, typical null hypothesis significance tests from high-profile journals and one null hypothesis significance test used as an example in a statistics textbook
The only difference between using an optimal α or a standard α is whether critical effect sizes and relative costs are thoughtfully considered and stated, or implied and unstated
Summary
A well-known problem associated with null hypothesis significance tests (NHST) is the arbitrariness of the chosen experimental significance level, alpha (α). The ease with which a correctly interpreted null hypothesis significance test can be used as a decision-making tool causes it to continue to be favoured in most scientific fields. The goal of these tests should be to provide us with conclusions in which we have the highest possible confidence. The logical decision-making significance threshold, α, should be the value that minimizes the probability or, occasionally, the cost of making any relevant error. In the former case, this would make the goal of the statistical test to avoid making an erroneous conclusion, while in the latter it would make the goal of the statistical test to avoid making a costly erroneous conclusion. We feel that doing statistics for purposes other than these would be outside the realms of pure and applied science.