Abstract
This study investigated the test fairness of the translation section of a large-scale English test in China by examining its Differential Test Functioning (DTF) and Differential Item Functioning (DIF) across gender and major. Regarding DTF, the translation section as a whole exhibited partial strong measurement invariance across female and male test takers, and full measurement invariance across test takers in (1) arts & humanities and social sciences (A&HSS) majors and (2) science, technology, engineering, or mathematics (STEM) majors. No major-based DIF was detected. Objective test items tended to favor male test takers, whereas the direct translation task was more favorable to female test takers. Taken together, the DIF and DTF results suggest that a cancelation effect may be present in this case. However, the DIF effect sizes were either negligible or slight to moderate, indicating minimal impact on the overall fairness of the translation test task. The study further discusses the necessity of exploring the source of DIF and the importance of combining DIF and DTF in test fairness research.