Abstract

A pool of items from operational tests of mathematical reasoning was constructed to investigate the feasibility of using automated test assembly methods to simultaneously moderate possibly irrelevant differences between the performance of women and men and of African‐American and White test takers. None of the artificial tests investigated exhibited substantial impact moderation, although the estimated mean scaled score differences for the relevant population indicated a modest move in the intended direction: the difference between scaled score means was reduced by about 20% for women and men and about 9% for African‐American and White test takers. Although many issues in the implementation of this methodology remain to be solved, the consideration of impact in automated test assembly, along with the maintenance of the detailed test plan, appears to be a potential method of moderating possibly irrelevant mean test score differences.
