The Effects of Test Length and Sample Size on the Reliability and Equating of Tests Composed of Constructed-Response Items

Anne R Fitzpatrick, Wendy M Yen

doi:10.1207/s15324818ame1401_04

Journal: Applied Measurement in Education
Publication Date: Jan 1, 2001
Citations: 16

Abstract

Examined in this study were the effects of test length and sample size on the alternate forms reliability and the equating of simulated mathematics tests composed of constructed-response items scaled using the 2-parameter partial credit model. Test length was defined in terms of the number of both items and score points per item. Tests with 2, 4, 8, 12, and 20 items were generated, and these items had 2, 4, and 6 score points. Sample sizes of 200, 500, and 1,000 were considered. Precise item parameter estimates were not found when 200 cases were used to scale the items. To obtain acceptable reliabilities and accurate equated scores, the findings suggested that tests should have at least eight 6-point items or at least 12 items with 4 or more score points per item.
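The simulation design described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: it generates responses from a 2-parameter partial credit model using the true item parameters, skips the scaling and equating steps entirely, and reports alternate forms reliability as the correlation between summed scores on two independently generated forms. The parameter distributions, the function names, and the reading of "score points" as the number of score categories per item (0 through n_levels - 1) are all assumptions made for illustration.

```python
import math
import random


def category_probs(theta, a, steps):
    """Category probabilities for one item under a 2-parameter partial credit model.

    The logit for category k is the running sum of a * (theta - b_v) over the
    first k step parameters, with the logit for category 0 fixed at zero.
    """
    logits = [0.0]
    for b in steps:
        logits.append(logits[-1] + a * (theta - b))
    top = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(z - top) for z in logits]
    total = sum(weights)
    return [w / total for w in weights]


def make_form(n_items, n_levels, rng):
    """Draw hypothetical item parameters (assumed distributions, not the paper's)."""
    return [
        (rng.uniform(0.7, 1.3), [rng.gauss(0.0, 1.0) for _ in range(n_levels - 1)])
        for _ in range(n_items)
    ]


def total_score(theta, form, rng):
    """Sample one examinee's summed score on a form."""
    score = 0
    for a, steps in form:
        probs = category_probs(theta, a, steps)
        score += rng.choices(range(len(probs)), weights=probs)[0]
    return score


def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    return sxy / math.sqrt(sxx * syy)


def alternate_forms_reliability(n_items, n_levels, n_examinees, seed=0):
    """Correlate summed scores on two independently generated forms."""
    rng = random.Random(seed)
    form_a = make_form(n_items, n_levels, rng)
    form_b = make_form(n_items, n_levels, rng)
    thetas = [rng.gauss(0.0, 1.0) for _ in range(n_examinees)]
    x = [total_score(t, form_a, rng) for t in thetas]
    y = [total_score(t, form_b, rng) for t in thetas]
    return pearson(x, y)


if __name__ == "__main__":
    # Design grid taken from the abstract: items x score points, at one sample size.
    for n_items in (2, 4, 8, 12, 20):
        for n_levels in (2, 4, 6):
            r = alternate_forms_reliability(n_items, n_levels, 1000)
            print(n_items, n_levels, round(r, 3))
```

Because no item parameters are estimated here, sample size affects only the sampling noise of the correlation. Reproducing the reported sample-size effects would additionally require calibrating the items at each sample size and equating the forms.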

Full Text