Peer assessment, in which students evaluate their peers' submissions, has become the standard approach to grading large-scale open-ended assignments in MOOCs. Since peer grades may be biased and unreliable, a family of probabilistic graphical models has been proposed to improve the estimation of the true scores of assignments from peer grades by explicitly modeling the bias and reliability of each grader. However, these models assume that a grader's reliability is determined only by his or her knowledge and ability level, ignoring grading behavior. In practice, grading behaviors (e.g., the time spent reviewing an assignment) reflect how seriously graders take the assessment and strongly affect their reliability. Following this intuition, we propose two novel probabilistic graphical models for cardinal peer assessment, named BPG6 and BPG7, which improve the modeling of grader reliability by incorporating various grading behaviors. In addition, we develop a peer assessment system, named BPA, which collects rich grading-behavior data and thus supports both the evaluation of the proposed models and our teaching practices. Experimental results on the collected datasets show that the proposed models improve the accuracy of true-score estimation by leveraging graders' grading behaviors. Finally, by analyzing feedback from teaching practices that used the BPA system for peer assessment, we not only confirm the usability of the BPA system but also further demonstrate the effectiveness of the proposed models in peer assessment.
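To make the modeling idea concrete, the following is a minimal sketch, not the paper's actual BPG6/BPG7 models: each peer grade is treated as the true score plus a per-grader bias and Gaussian noise whose precision (reliability) is shrunk toward a behavior-informed prior. The linear link between log review time and precision, the shrinkage weight, and all variable names here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: each of 50 assignments is graded by 4 of 30 peers.
n_items, n_graders, k = 50, 30, 4
true_score = rng.normal(70, 10, n_items)
bias = rng.normal(0, 3, n_graders)               # per-grader leniency/severity
review_time = rng.lognormal(3, 0.5, n_graders)   # behavior signal (seconds)
true_tau = 0.05 + 0.02 * np.log(review_time)     # precision grows with effort

who = np.array([rng.choice(n_graders, k, replace=False) for _ in range(n_items)])
grade = true_score[:, None] + bias[who] + rng.normal(0, true_tau[who] ** -0.5)

# Behavior-informed prior on each grader's precision (the core intuition,
# reduced here to a hand-set linear link on log review time).
prior_tau = 0.05 + 0.02 * np.log(review_time)

s_hat = grade.mean(axis=1)          # init: plain unweighted average
b_hat = np.zeros(n_graders)
tau_hat = prior_tau.copy()
for _ in range(25):
    # Precision-weighted estimate of each assignment's true score.
    w = tau_hat[who]
    s_hat = ((grade - b_hat[who]) * w).sum(1) / w.sum(1)
    # Re-fit each grader's bias/precision, shrinking toward the behavior prior.
    for g in range(n_graders):
        mask = who == g
        resid = grade[mask] - s_hat[mask.any(axis=1)]
        n_g = resid.size
        if n_g == 0:
            continue
        b_hat[g] = resid.mean()
        emp_tau = n_g / max(np.sum((resid - b_hat[g]) ** 2), 1e-6)
        tau_hat[g] = (n_g * emp_tau + 5.0 * prior_tau[g]) / (n_g + 5.0)

print("RMSE, plain mean     :", np.sqrt(np.mean((grade.mean(1) - true_score) ** 2)))
print("RMSE, behavior-aware :", np.sqrt(np.mean((s_hat - true_score) ** 2)))
```

On this synthetic data, the precision-weighted estimate typically tracks the true scores more closely than the plain average, illustrating why modeling behavior-linked reliability can help.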