Abstract

Common estimators of interrater reliability include Pearson product-moment correlation coefficients, Spearman rank-order correlations, and the generalizability coefficient. The purpose of this study was to examine the accuracy of estimators of interrater reliability when varying the true reliability, the number of scale categories, and the number of essays rated. We used Monte Carlo methods to draw samples from known population models and examined the accuracy of selected estimators of interrater reliability between two raters. In addition to the estimators named above, we included the polychoric correlation coefficient because it aligns with the context in which student language assessments are rated. Although each estimator produced an estimate close to the population parameter, polychoric correlations provided the closest estimates, with mean and median bias equal to 0.00 (SD = 0.05) across all conditions. The use of Pearson product-moment and Spearman rank-order correlation coefficients might result in the underestimation of interrater reliability by as much as a third.
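The Monte Carlo design described above can be illustrated with a minimal sketch. It is not the study's code: the bivariate normal population model, the equal-probability category thresholds, the function names, and the example condition are all assumptions made for illustration. It draws latent scores for two raters with a known correlation, cuts them into ordered rating categories, and reports the mean bias of the Pearson and Spearman estimates. The polychoric correlation is omitted because the Python standard library has no implementation of it.

```python
import math
import random
from statistics import NormalDist, mean


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def categorize(z, thresholds):
    """Map a latent score to an ordered category 1..len(thresholds)+1."""
    return 1 + sum(z > t for t in thresholds)


def simulate(true_r, n_categories, n_essays, n_reps, seed=0):
    """Return (mean Pearson bias, mean Spearman bias) for one condition.

    Assumption: latent scores are standard bivariate normal with
    correlation true_r, and thresholds give equally likely categories.
    """
    rng = random.Random(seed)
    nd = NormalDist()
    thresholds = [nd.inv_cdf(k / n_categories) for k in range(1, n_categories)]
    pearson_est, spearman_est = [], []
    for _ in range(n_reps):
        rater_a, rater_b = [], []
        for _ in range(n_essays):
            z1 = rng.gauss(0, 1)
            z2 = true_r * z1 + math.sqrt(1 - true_r ** 2) * rng.gauss(0, 1)
            rater_a.append(categorize(z1, thresholds))
            rater_b.append(categorize(z2, thresholds))
        # Skip degenerate samples in which a rater used a single category.
        if len(set(rater_a)) > 1 and len(set(rater_b)) > 1:
            pearson_est.append(pearson(rater_a, rater_b))
            spearman_est.append(spearman(rater_a, rater_b))
    return mean(pearson_est) - true_r, mean(spearman_est) - true_r


if __name__ == "__main__":
    # Hypothetical condition: true reliability 0.8, 3 categories, 100 essays.
    bias_p, bias_s = simulate(true_r=0.8, n_categories=3, n_essays=100, n_reps=200)
    print(f"Pearson bias: {bias_p:+.3f}, Spearman bias: {bias_s:+.3f}")
```

Under these assumptions, both biases should come out negative when the scale has few categories, which is the direction of underestimation the abstract reports. The magnitudes depend on the assumed thresholds and are not comparable to the study's results.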
