Abstract

With increased emphasis on teacher quality in the Race to the Top federal grants program, rater agreement is an important topic in teacher evaluation. Variations of kappa have often been used to assess inter-rater reliability (IRR), but research has shown that kappa suffers from a paradox in which high exact agreement can produce low kappa values. Two chance-corrected measures of IRR were examined to determine whether Gwet’s AC1 statistic provides a more stable estimate than kappa. Findings suggest that Gwet’s AC1 statistic outperforms kappa as a chance-corrected measure of IRR when compared against exact agreement, and that AC1 shows promise for future IRR studies in a teacher evaluation context.
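To make the kappa paradox concrete, the sketch below computes Cohen’s kappa and Gwet’s AC1 for a hypothetical two-rater, two-category table with skewed marginals (e.g., most teachers rated “effective”). It uses the standard published formulas for the two statistics, not the paper’s own data or code; the example counts are illustrative assumptions only.

```python
# Illustration of the kappa paradox: with highly skewed marginals, exact
# agreement is high but Cohen's kappa is low, while Gwet's AC1 stays close
# to the observed agreement. Standard two-rater formulas; example data is made up.

def cohens_kappa(table):
    """table[i][j] = count of items rater A placed in category i and rater B in j."""
    q = len(table)
    n = sum(sum(row) for row in table)
    po = sum(table[i][i] for i in range(q)) / n                    # observed exact agreement
    row = [sum(r) / n for r in table]                              # rater A marginal proportions
    col = [sum(table[i][j] for i in range(q)) / n for j in range(q)]  # rater B marginals
    pe = sum(row[k] * col[k] for k in range(q))                    # chance agreement (product of marginals)
    return (po - pe) / (1 - pe)

def gwets_ac1(table):
    """Gwet's AC1 for two raters; chance agreement is based on average category prevalence."""
    q = len(table)
    n = sum(sum(row) for row in table)
    po = sum(table[i][i] for i in range(q)) / n
    row = [sum(r) / n for r in table]
    col = [sum(table[i][j] for i in range(q)) / n for j in range(q)]
    pi = [(row[k] + col[k]) / 2 for k in range(q)]                 # average prevalence per category
    pe = sum(p * (1 - p) for p in pi) / (q - 1)                    # AC1 chance-agreement term
    return (po - pe) / (1 - pe)

# Hypothetical skewed ratings: 90 items both raters call "effective",
# 5 disagreements in each direction, none jointly "ineffective" -> 90% exact agreement.
table = [[90, 5],
         [5, 0]]
print(round(cohens_kappa(table), 3))  # approx -0.053: kappa is near zero despite 90% agreement
print(round(gwets_ac1(table), 3))     # approx  0.890: AC1 tracks the observed agreement
```

With 90% exact agreement, kappa is slightly negative because its chance-agreement term is inflated by the skewed marginals, whereas AC1 remains close to the observed agreement, which is the stability property the study examines.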
