Abstract
Trend scoring of constructed-response items (i.e., rescoring Time A responses at Time B) gives rise to two-way data that follow a product-multinomial distribution rather than the multinomial distribution that is usually assumed. Recent work has shown that this difference in sampling model can have profound negative effects on the statistics commonly used to monitor rater drift. This paper introduces three statistics, termed E-statistics, that account for product-multinomial sampling by comparing conditional distributions. A simulation compares their performance with that of the paired t-test and Stuart's Q in detecting rater drift. Both the paired t-test and Q suffered extreme Type I error inflation for certain rescore study designs. The new E-statistics maintained good Type I error control and had good power to detect rater drift across occasions.
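To make the distinction between the two sampling models concrete, the following is a minimal illustrative sketch, not the simulation design used in the paper and not an implementation of the E-statistics. It assumes that under product-multinomial sampling the Time A score margin (the row totals) is fixed by the rescore design, whereas under multinomial sampling only the grand total is fixed. The function names, the three score categories, and all probabilities are hypothetical values chosen for illustration.

```python
import random


def sample_product_multinomial(row_totals, cond_probs, rng):
    """Draw a two-way table with fixed row totals.

    Row i is an independent multinomial draw of size row_totals[i]
    using the conditional probabilities cond_probs[i]
    (Time B score given Time A score).
    """
    table = []
    for n_i, probs in zip(row_totals, cond_probs):
        counts = [0] * len(probs)
        for j in rng.choices(range(len(probs)), weights=probs, k=n_i):
            counts[j] += 1
        table.append(counts)
    return table


def sample_multinomial(n, joint_probs, rng):
    """Draw a two-way table in which only the grand total n is fixed."""
    n_rows, n_cols = len(joint_probs), len(joint_probs[0])
    cells = [(i, j) for i in range(n_rows) for j in range(n_cols)]
    weights = [joint_probs[i][j] for i, j in cells]
    table = [[0] * n_cols for _ in range(n_rows)]
    for i, j in rng.choices(cells, weights=weights, k=n):
        table[i][j] += 1
    return table


# Hypothetical design: 50 responses rescored at each Time A score level.
rng = random.Random(0)
row_totals = [50, 50, 50]
cond_probs = [  # assumed P(Time B score | Time A score); each row sums to 1
    [0.8, 0.15, 0.05],
    [0.1, 0.8, 0.1],
    [0.05, 0.15, 0.8],
]
# Joint probabilities implied by equal Time A marginals of 1/3 (an assumption).
joint_probs = [[p / 3 for p in row] for row in cond_probs]

pm_table = sample_product_multinomial(row_totals, cond_probs, rng)
mn_table = sample_multinomial(sum(row_totals), joint_probs, rng)

pm_row_sums = [sum(row) for row in pm_table]  # fixed by design
mn_row_sums = [sum(row) for row in mn_table]  # random; only the total is fixed
```

In the product-multinomial table the row sums always equal the design values, so only the conditional distributions within rows carry sampling variability. In the multinomial table the row sums themselves vary from sample to sample.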