Proponents of what has been termed the Gender Similarities Hypothesis (GSH) have typically relied on meta-analyses as well as the generation of nonsignificant tests of mean differences to support their argument that the genders are more similar than they are different. In the present article, we argue that alternative statistical methodologies, such as tests of equivalence, can provide more accurate (yet equally rigorous) tests of these hypotheses and therefore might serve to complement, challenge, and/or extend findings from meta-analyses. To demonstrate and test the usefulness of such procedures, we examined Scholastic Aptitude Test–Math (SAT-M) data to determine the degree of similarity between genders in the historically gender-stereotyped field of mathematics. Consistent with previous findings, our results suggest that men and women performed similarly on the SAT-M for every year that we examined (1996–2009). Importantly, our statistical approach provides a greater opportunity to open a dialogue on theoretical issues surrounding what does and what should constitute a meaningful difference in intelligence and achievement. As we note in the discussion, it remains important to consider whether even very small but consistent gender differences in mean test performance could reflect stereotype threat in the testing environment and/or gender biases in the test itself that would be important to address.
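To make the contrast with a nonsignificant difference test concrete, the following is a minimal sketch of one common equivalence procedure, two one-sided tests (TOST), using a large-sample normal approximation. It is not necessarily the procedure used in the article, which the abstract does not specify. The function names, the summary statistics, and the equivalence margin are all hypothetical and are not SAT-M data. The key idea is that the null hypothesis is reversed: the groups are assumed to differ by at least the margin, and equivalence is claimed only if both one-sided tests reject.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tost_z(mean_a, sd_a, n_a, mean_b, sd_b, n_b, margin):
    """Two one-sided tests (TOST) for equivalence of two group means.

    Uses a large-sample z approximation (an assumption of this sketch).
    `margin` is the largest absolute mean difference still treated as
    practically equivalent; choosing it is a substantive decision.
    Returns the observed difference and the TOST p-value.
    """
    diff = mean_a - mean_b
    se = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    # Test 1: H0 is diff <= -margin; reject when diff is well above -margin.
    p_lower = 1.0 - normal_cdf((diff + margin) / se)
    # Test 2: H0 is diff >= +margin; reject when diff is well below +margin.
    p_upper = normal_cdf((diff - margin) / se)
    # Equivalence requires rejecting both, so report the larger p-value.
    return diff, max(p_lower, p_upper)


# Hypothetical summary statistics, for illustration only (not SAT-M data).
diff, p_value = tost_z(520.0, 110.0, 5000, 515.0, 105.0, 5000, margin=20.0)
equivalent = p_value < 0.05

# The same data with a much stricter (hypothetical) margin.
_, p_strict = tost_z(520.0, 110.0, 5000, 515.0, 105.0, 5000, margin=2.0)
equivalent_strict = p_strict < 0.05
```

With these made-up numbers, the groups count as equivalent under the wide margin but not under the strict one, which illustrates why the choice of margin is tied to the question of what should constitute a meaningful difference.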