Abstract

The application of null hypothesis testing to psychological research has been much criticized (e.g., Bakan, 1966; Gigerenzer & Murray, 1987). I examine the specific relevance of these and other criticisms for research on sex differences. Four specific problems are identified: (1) drawing inferences about general properties that are attributed to all members of a population; (2) the distinction between the size of p and the size and theoretical importance of a difference; (3) the frequently unjustified assumption of normality; and (4) the semantic problems inherent in the language of interpretation. A few solutions are explored, and it is concluded that descriptions of the results of sex comparisons, as well as others, must reflect the data more accurately than they now do.

A number of years ago, in an invited address that I gave at the annual meeting of the Canadian Psychological Association, I criticized the way in which standard null hypothesis statistics are used in research on sex differences. I argued that in just about every known comparison of the behaviour of females and males there is generally a substantial amount of overlap between the distributions of the dependent variable for the members of each sex (if the behaviour in question is physically possible for both). This is true even if a statistical test has led to the inference that the difference is significant (p < .01, p […]). Notwithstanding such extreme cases of overlap, the mere fact that any overlap occurs at all, I reasoned, renders logically false the descriptive statements that are made when differences are said to be significant. For example, to cite a currently contentious issue, suppose it is found that the average score obtained by boys on a math test is significantly higher than that obtained by girls. Since these scores invariably overlap (see, e.g., Benbow, 1988, p. 219; Hyde et al., 1990), it is false to translate the statistical result into an affirmation that boys are better at math than girls. I should perhaps add that this logic also applies when girls obtain higher average scores than boys, a result which has been observed in a larger variety of situations than seems to be commonly known (Kimball, 1989).

During the question period, I was asked how it is possible to make a special case that null hypothesis testing is inappropriate for sex differences, when these tests were in general use for most other kinds of psychological data. My questioner clearly did not mean to imply that significance testing should be discarded altogether (or so I understood him) but rather that, since these types of statistical procedures are the standard tools of contemporary experimental psychology and therefore must be valid, it could hardly be legitimate to make a special case for their irrelevance to sex differences. To my embarrassment, I was unable at that time to give a satisfactory reply, either to the questioner or to myself. On the one hand, I was convinced that my reasoning was correct as far as sex differences are concerned; on the other hand, I was still so attached to significance tests as a general tool for psychological research that I was not yet ready to think that their application could be more generally problematical.

Over the intervening years I have discovered a rather substantial literature that is critical of the ways in which null hypothesis testing has been integrated into the entire psychological research process. …
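The point that a significant result can coexist with heavily overlapping distributions can be illustrated with a small numerical sketch. Everything in it is an assumption made for illustration and none of it comes from the article: the function name `overlap_and_p`, the effect size of 0.2 standard deviations, the group size of 1000, and in particular the use of two equal-variance normal populations, which is exactly the kind of simplifying assumption the article questions.

```python
from math import sqrt
from statistics import NormalDist


def overlap_and_p(d, n_per_group):
    """Illustrative only: overlap and p for two hypothetical groups.

    Assumes two normal populations with standard deviation 1 whose means
    differ by d, and assumes the observed mean difference equals d exactly.
    Returns (overlap coefficient, two-sided p from a z test).
    """
    group_a = NormalDist(mu=0.0, sigma=1.0)
    group_b = NormalDist(mu=d, sigma=1.0)

    # Shared area under the two density curves (0 = disjoint, 1 = identical).
    overlap = group_a.overlap(group_b)

    # z statistic for a difference of two means, each based on n_per_group
    # scores with known unit variance: d / sqrt(1/n + 1/n).
    z = d * sqrt(n_per_group / 2)
    p = 2 * (1 - NormalDist().cdf(z))
    return overlap, p


# Hypothetical numbers: a small mean difference and large samples.
overlap, p = overlap_and_p(d=0.2, n_per_group=1000)
print(f"overlap = {overlap:.2f}, p = {p:.2g}")
```

Under these assumed numbers the two distributions share roughly 92% of their area, yet the test gives a p value well below .001, so the size of p says little about how far apart the two groups of individuals actually are.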

