The meaning of significance in data testing

Abstract

Recent developments in psychology (e.g., Nuzzo, 2014; Trafimow, 2014; Woolston, 2015a) reveal apparently reasonable but inherently flawed positions against data testing techniques (often called hypothesis testing techniques, even though they do not test hypotheses but assume them true for testing purposes). These positions include explicitly banning testing and implicitly banning most inferential statistics (Trafimow and Marks, 2015, for Basic and Applied Social Psychology; but see Woolston, 2015a, expanded at http://www.nature.com/news/psychology-journal-bans-p-values-1.17001), recommending the substitution of confidence intervals for null hypothesis significance testing (NHST) explicitly and for all other data testing implicitly (Cumming, 2014, for Psychological Science; but see Perezgonzalez, 2015a; Savalei and Dunn, 2015), and recommending research preregistration as a solution to the low publication rate of non-significant results (e.g., Woolston, 2015b). In reading Woolston's articles, readers' comments on those articles, and the related literature, it appears that old philosophical misinterpretations, already discussed by, for example, Meehl (1997), Nickerson (2000), Kline (2004), and Goodman (2008), are not getting through and still need to be re-addressed today. I believe that a chief source of misinterpretation is the current NHST framework, an incompatible mishmash of the testing theories of Fisher and of Neyman-Pearson (Gigerenzer, 2004). The resulting misinterpretations have both a statistical and a theoretical background. Statistical misinterpretations of p-values have been addressed elsewhere (Perezgonzalez, 2015c), so I reserve this article for resolving theoretical misinterpretations regarding statistical significance. The main confusions regarding statistical significance can be summarized in the following seven points (e.g., Kline, 2004): (1) significance implies an important, real effect size; (2) no significance implies a trivial effect size; (3) significance disproves the tested hypothesis; (4) significance proves the alternative hypothesis; (5) significance exonerates the methodology used; (6) no significance is explainable by bad methodology; and (7) no significance in a follow-up study means a replication failure. These seven points can be discussed according to two concerns: the meaning of significance itself, and the meaning, or role, of testing. In this article I will avoid NHST and, instead, refer to either Fisher's or Neyman-Pearson's approach, as appropriate. I will also avoid their conceptual mix-up by using the concepts that seem most coherent under each approach. Thus, Fisher's approach seeks significant results, tests data against a null hypothesis (H0), and uses levels of significance (sig) to ascertain the probability of the data under H0 (Figure 1A). Neyman-Pearson's approach seeks to make a decision, tests data against a main hypothesis (HM), and decides in favor of an alternative hypothesis (HA) according to a cut-off calculated a priori from sample size (N), Type I error probability (α), minimum effect size (MES), and power (1 − β), the latter two provided by HA (Figure 1B).

Figure 1. Fisher's (A) and Neyman-Pearson's (B) approaches to data testing mapped onto a t-test distribution with 64 degrees of freedom.
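To make the contrast concrete, the following is a minimal sketch in Python of the two procedures applied to the same data. The numbers are illustrative assumptions, not taken from the article: two simulated groups of n = 33 each (giving the 64 degrees of freedom of Figure 1), α = 0.05, and a hypothetical minimum effect size of d = 0.5 for the Neyman-Pearson calculation.

    # Illustrative sketch only; group sizes, alpha, the simulated data,
    # and the minimum effect size d = 0.5 are assumptions, not article values.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(seed=1)
    n = 33                                    # per group -> df = 2n - 2 = 64
    group_a = rng.normal(loc=0.0, scale=1.0, size=n)
    group_b = rng.normal(loc=0.5, scale=1.0, size=n)

    # Fisher: test the data against H0 only and report the p-value as a
    # graded measure of evidence against H0 (no alternative hypothesis).
    t, p = stats.ttest_ind(group_a, group_b)
    print(f"Fisher: t(64) = {t:.2f}, p = {p:.3f}")

    # Neyman-Pearson: fix alpha, the minimum effect size (MES), and the
    # implied power a priori, derive the cut-off, then decide dichotomously.
    alpha = 0.05
    df = 2 * n - 2
    t_crit = stats.t.ppf(1 - alpha / 2, df)   # two-sided cut-off
    mes = 0.5                                 # hypothetical MES (Cohen's d)
    ncp = mes * np.sqrt(n / 2)                # noncentrality under HA
    power = 1 - stats.nct.cdf(t_crit, df, ncp) + stats.nct.cdf(-t_crit, df, ncp)
    decision = "reject HM in favor of HA" if abs(t) >= t_crit else "retain HM"
    print(f"Neyman-Pearson: |t| = {abs(t):.2f} vs cut-off {t_crit:.2f} -> {decision}")
    print(f"A priori power for d = {mes}: {power:.2f}")

Run as written, the Fisher branch reports a graded p-value for the observed data under H0, whereas the Neyman-Pearson branch returns a binary decision whose error probabilities (α, β) were fixed before the data were seen; this is the design difference that Figure 1 maps onto the same t-distribution.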


