Abstract

This study examines the similarity between 2 item response theory (IRT) equating procedures, the similarity between equipercentile equating and the 2 IRT equating procedures, and the relation between discrepancies in the equating results and differences in the difficulty of the 2 equated forms. The findings revealed that (a) the IRT observed-score equating procedure resulted in observed-score distributions with larger standard deviations, larger means for positively skewed distributions of raw scores, and smaller means for negatively skewed distributions of raw scores; (b) the equating loss produced by the unsmoothed equipercentile equating fluctuated across the total test scores; (c) IRT true-score equating produced more stable equating results than the other 2 equating methods; however, the mean differences in equating stability among the 3 equating methods were very small, and the mean differences in equating stability between the 2 IRT equating methods were not statistically significant; (d) IRT observed-score equating produced more stable equating results than equipercentile equating; and (e) larger equating differences between any 2 of the 3 equating methods appeared to arise when the difficulty differences between the 2 equated forms were larger.
