Abstract

Based on initial SAT‐Verbal pretest data and/or hypotheses advanced in the research literature, the authors selected 7 sentence completions and 16 analogies with extreme levels of differential item functioning (DIF) and then systematically revised and readministered the items in an attempt to reduce or eliminate DIF. Several diverse conclusions can be drawn from the data. First, because of the apparent success in reducing extreme levels of DIF in SAT‐Verbal items, the authors recommend that such efforts be continued. Second, the particular terminology used in stems and keys (rather than the underlying reasoning skill being measured) seems to be a recurring source of DIF in SAT‐Verbal items. Third, larger sample sizes, particularly for minority focal groups, would help to stabilize the DIF categories used by Educational Testing Service (ETS) test developers. Fourth, because the ETS delta metric is unbounded at the extremes, the use of both the Standardization (p‐metric) and Mantel‐Haenszel (delta‐metric) methodologies is recommended for classifying the level of DIF for very easy and very difficult items. Finally, the paper concludes with a suggestion for further research concerning the possible relationship between DIF and predictive validity.
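
The fourth conclusion can be illustrated with a minimal sketch, which is not the authors' code. It assumes the standard textbook definitions: the Mantel‐Haenszel index MH D-DIF = -2.35 ln(alpha_MH), where alpha_MH is the common odds ratio pooled over matched score levels, and the standardized p-difference weighted by focal-group counts. The function names, the tuple layout, and all counts below are hypothetical and chosen only for illustration.

```python
import math

# Each stratum is one matched score level, given as a tuple of counts:
# (reference right, reference wrong, focal right, focal wrong).
# This layout is an assumption made for the sketch.

def mh_d_dif(strata):
    """Mantel-Haenszel DIF on the delta metric: -2.35 * ln(alpha_MH)."""
    num = 0.0
    den = 0.0
    for ref_right, ref_wrong, foc_right, foc_wrong in strata:
        n = ref_right + ref_wrong + foc_right + foc_wrong
        if n == 0:
            continue  # empty score level carries no information
        num += ref_right * foc_wrong / n
        den += ref_wrong * foc_right / n
    if num <= 0 or den <= 0:
        raise ValueError("common odds ratio is undefined for these counts")
    return -2.35 * math.log(num / den)

def std_p_dif(strata):
    """Standardized p-difference (focal minus reference), focal-weighted."""
    num = 0.0
    weight = 0.0
    for ref_right, ref_wrong, foc_right, foc_wrong in strata:
        n_ref = ref_right + ref_wrong
        n_foc = foc_right + foc_wrong
        if n_ref == 0 or n_foc == 0:
            continue  # both groups are needed to compare at this level
        num += n_foc * (foc_right / n_foc - ref_right / n_ref)
        weight += n_foc
    if weight == 0:
        raise ValueError("no score level contains both groups")
    return num / weight

# Hypothetical counts: the same 2-point gap in proportion correct,
# once on a very easy item and once on a medium-difficulty item.
easy_item = [(990, 10, 970, 30)]
medium_item = [(500, 500, 480, 520)]

for name, item in (("easy", easy_item), ("medium", medium_item)):
    print(name, round(std_p_dif(item), 3), round(mh_d_dif(item), 2))
```

With these made-up counts both items show the same p-metric difference of -0.02, yet the delta-metric value is about -2.6 for the very easy item and about -0.2 for the medium item. This is the kind of divergence at the extremes of difficulty that motivates reporting both indices.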
