Abstract
This study uses FACETS, a many-facet Rasch measurement computer program, to explore differences in rater severity and consistency between automated computer scoring and the ratings of 15 expert raters on 215 examinees' speaking recordings, drawn from a mock administration of the Computer-based English Listening-Speaking Test (Guangdong). The results show that differences in rater severity between the automated scoring and the expert raters do not exert a decisive influence on the examinees' score distribution. The low bias rate of the automated scoring further indicates that it outperforms the human raters in internal consistency.