Abstract

Probabilistic models of competence assessment combine the benefits of automation with human judgment. We start this paper by replicating two preexisting probabilistic models of peer assessment (PG1-bias and PAAS). Although both build on probability theory, their approaches are radically different: PG1-bias is purely Bayesian, whereas PAAS models the evaluation process in a classroom as a multiagent system in which each actor relies on the judgment of others as long as their opinions coincide. To reconcile the benefits of Bayesian inference with the concept of trust posed in PAAS, we propose a third peer evaluation model, PG-bivariate, which considers the correlation between any pair of peers who have evaluated someone in common. The rest of the paper is devoted to a comparison of these three models on synthetic data. We show that PG1-bias produces predictions with lower root mean squared error (RMSE) than PG-bivariate. However, both models behave similarly when choosing the next assignment to be graded by a peer, with an “RMSE decreasing” policy reporting better results than a random policy. Fair comparisons among the three models show that PG1-bias makes the lowest error when ground-truth grades are scarce. Nevertheless, once nearly 20% of the teacher’s assessments are introduced, PAAS sometimes exceeds the quality of PG1-bias’ predictions by following an entropy minimization heuristic. PG-bivariate, our new proposal to reconcile PAAS’ trust-based approach with PG1-bias’ theoretical background, obtains error values similar to those of the original models. Future work includes applying the models to real experimental data and exploring new heuristics to determine which teacher’s grade should be obtained next to minimize the overall error.
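
To make the selection policies used in the comparison concrete, the following is a minimal sketch, not the paper's implementation, of how an "RMSE decreasing" policy could be contrasted with a random policy in a synthetic-data setting where all teacher grades are known in advance. The model interface (refit_with, predict_grades), the function names, and the dictionary-based grade representation are hypothetical placeholders for whichever of the three models (PG1-bias, PAAS, PG-bivariate) is being evaluated.

import math
import random

def rmse(predicted, reference):
    # Root mean squared error between predicted grades and reference (teacher) grades,
    # computed over the assignments present in both dictionaries.
    keys = [k for k in reference if k in predicted]
    return math.sqrt(sum((predicted[k] - reference[k]) ** 2 for k in keys) / len(keys))

def next_assignment_rmse_policy(model, ungraded, teacher_grades):
    # Greedy "RMSE decreasing" policy: pick the ungraded assignment whose ground-truth
    # grade, once revealed to the model, lowers the RMSE of its predictions the most.
    # `model.refit_with` and `model.predict_grades` are a hypothetical interface.
    best, best_rmse = None, float("inf")
    for assignment in ungraded:
        candidate = model.refit_with(assignment, teacher_grades[assignment])
        error = rmse(candidate.predict_grades(), teacher_grades)
        if error < best_rmse:
            best, best_rmse = assignment, error
    return best

def next_assignment_random_policy(ungraded):
    # Baseline policy: pick the next assignment to grade uniformly at random.
    return random.choice(list(ungraded))

In a real deployment the ground truth of ungraded assignments would not be available, so the simulated refit above would be replaced by an estimate of the expected error reduction; the greedy structure of the loop stays the same.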

Highlights

  • Automated assessment and feedback of open-response assignments remain a challenge in computer science despite recent milestones in natural language processing

  • The variability of answers to open questions, and the challenges it poses to natural language processing (NLP) techniques, makes automatic assessment a limited tool for this task

  • Many authors have proposed peer assessment techniques as a promising alternative to speed up evaluation in online courses


Summary

Introduction

Automated assessment and feedback of open-response assignments remain a challenge in computer science despite recent milestones in natural language processing. Even with the efforts to open the black box of neural networks, current neural models are rarely equipped with logical narratives of the decision chains that lead them to a final prediction or classification. Many researchers therefore propose hybrid solutions that combine the benefits of automation with human judgment, using peer assessments to help the teacher evaluate students in large classrooms. Numerous studies from the field of psychology point out that peer evaluation methods positively impact students’ formative process, leading to self-reflection [1,2,3].
