This study develops a prediction model for paper quality assessment to support technology-assisted peer review. The prediction technique is intended to reduce the review burden, which is becoming a critical issue in today’s paper submission process. However, most existing work on this topic incorporates the reviewers’ comments, which is both unfair and unsuitable for reducing the review burden, since such comments only exist after a review has already been performed. Our prediction method therefore relies only on features extracted from the paper itself. The method covers three tasks: two classification tasks and one regression task. The classification tasks predict the final review decision (accepted vs. rejected) and estimate the paper quality (good vs. poor), while the regression task predicts the review scores. All three tasks are implemented using three main feature sets: citing-sentence features developed from a labeling scheme of citation functions, regular-sentence features created by applying the citation-function labels to non-citation text, and reference-based features constructed by identifying the sources of citations. Classification experiments on a dataset obtained from the International Conference on Learning Representations 2017–2020 show that our methods are more effective on the good-poor task than on the accepted-rejected task, with best accuracies of 0.75 and 0.73, respectively. Moreover, using only the citing-sentence features, we also reach a satisfactory recall of 0.99 on the good-poor task, retrieving as many good papers as possible. Our regression experiments indicate that the average review score is predicted more accurately than the individual review scores, with Root Mean Square Error (RMSE) values of 1.34 and 1.71, respectively.
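The evaluation metrics quoted above (accuracy and recall for the classification tasks, RMSE for the regression task) follow their standard definitions. As a minimal illustrative sketch, not taken from the paper, they can be computed as follows; the label names and sample values are invented for demonstration only.

```python
import math

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall(y_true, y_pred, positive="good"):
    """Fraction of true positives recovered (e.g., good papers retrieved)."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fn)

def rmse(y_true, y_pred):
    """Root Mean Square Error between true and predicted review scores."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

# Hypothetical example: good-poor classification and review-score regression.
labels = ["good", "good", "poor", "good"]
predicted = ["good", "good", "good", "poor"]
print(accuracy(labels, predicted))   # 0.5
print(recall(labels, predicted))     # ~0.667

scores = [5.0, 6.0]
predicted_scores = [6.0, 8.0]
print(rmse(scores, predicted_scores))
```

A high recall on the good-poor task, as reported in the abstract, means that very few genuinely good papers are missed, even if some poor papers are also flagged as good.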