Using simulated data, this study examined the impact of different levels of stringency of the valid case inclusion criterion on item response theory (IRT)-based true score equating over 5 years in the context of K–12 assessment when growth in student achievement is expected. Findings indicate that the use of the most stringent inclusion criterion generally yielded the most accurate results when overall root mean square error (RMSE) and bias were considered under both zero-growth and growth conditions, for both one-parameter logistic (1PL) and three-parameter logistic (3PL) IRT models, and for both fixed common item parameter (FCIP) and test characteristic curve (TCC) scaling methods. The positive impact of applying the most stringent valid case inclusion criterion was more salient with the 3PL model, under which greater classification accuracy was observed.