Small Sample Studies to Detect Flaws in Item Translations

Jose Muniz, Ronald K Hambleton, Dehui Xing

doi:10.1207/s15327574ijt0102_2

Abstract

The number of tests being translated and adapted from 1 language and culture to others is increasing substantially. One shortcoming in current methodology for identifying flawed items due to the test translation-adaptation process is the failure to carry out empirical analyses. One important reason for not conducting empirical studies is the view that large examinee samples are required that are often not available in translation-adaptation studies. The purpose of this article was to investigate 2 simple procedures for detecting potentially flawed items with small samples: (a) conditional item p value comparisons, and (b) delta plots. Several factors were varied in this computer simulation study: sample sizes and ability distributions of the reference and focal groups, amount of differential item functioning (DIF), and the statistical characteristics of the items where DIF was found. The findings showed that the 2 simple graphical-descriptive procedures can be valuable in identifying flawed test items, especially when the size of the flaws is substantial. An application of both procedures to actual test data also supported their utility. Although this study was stimulated by questions that have arisen in the context of language translations of tests, the procedures for identifying potentially flawed items are equally applicable for identifying other potential sources of bias in the test items such as gender and race.
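
The abstract names delta plots without describing how they are computed. As a rough illustration only, the sketch below follows the commonly described form of the method: each item's proportion correct p is converted to a delta value (13 - 4 times the normal deviate of p), a principal-axis line is fitted to the reference-versus-focal deltas, and items far from that line are flagged. The function names, the example p values, and the 1.5 cutoff are assumptions made for illustration; they are not taken from the article and may differ from the authors' implementation.

```python
import math
from statistics import NormalDist, mean


def delta(p):
    """Convert a proportion correct (0 < p < 1) to a delta value.

    Harder items (lower p) get larger deltas; p = 0.5 maps to 13.
    """
    return 13.0 - 4.0 * NormalDist().inv_cdf(p)


def delta_plot_distances(p_ref, p_focal):
    """Signed perpendicular distance of each item from the principal axis.

    Positive values mean the item is relatively harder for the focal group.
    Assumes the two delta vectors have nonzero covariance.
    """
    x = [delta(p) for p in p_ref]
    y = [delta(p) for p in p_focal]
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    # Slope and intercept of the principal (major) axis through the points.
    slope = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    intercept = my - slope * mx
    norm = math.sqrt(slope ** 2 + 1)
    return [(b - (slope * a + intercept)) / norm for a, b in zip(x, y)]


# Hypothetical item p values; the third item is much harder in the focal group.
p_reference = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
p_focal = [0.9, 0.8, 0.3, 0.6, 0.5, 0.4]

distances = delta_plot_distances(p_reference, p_focal)
# 1.5 delta units is an assumed, conventional cutoff, not a value from the article.
flagged = [i for i, d in enumerate(distances) if abs(d) > 1.5]
print(flagged)
```

With so few items, a single aberrant item also pulls the fitted line toward itself, so in practice the cutoff and the number of items both affect what gets flagged.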

Journal: International Journal of Testing
Publication Date: Jun 1, 2001
Citations: 63

Similar Papers

Accuracy of DIF Estimates and Power in Unbalanced Designs Using the Mantel–Haenszel DIF Detection Procedure
Insu Paek ... Hongwen Guo
Applied Psychological Measurement | VOL. 35
01 Oct 2011

The effect of large ability differences on type I error and power rates using SIBTEST and TESTGRAF DIF detection procedures
01 Apr 2002

DIF DETECTION SENSITIVITY OF LORD’S CHI-SQUARE, RAJU’S AREA, LOGISTIC REGRESSION, MANTEL-HAENSZEL, STANDARDIZATION, AND TRANSFORMED ITEM DIFFICULTIES METHODS, IN COMPARISON, USING R.
EPRA International Journal of Multidisciplinary Research (IJMR) | VOL. -
27 Jul 2021

Performance of SIBTEST When the Percentage of DIF Items is Large
Mark J Gierl ... Keith A Boughton
Applied Measurement in Education | VOL. 17
01 Jul 2004
