The stress of marking the assessment scripts of many candidates often results in fatigue, which can lower productivity and reduce marking consistency. Candidates frequently use words, phrases and sentences that are synonyms of, or related in meaning to, those stated in the marking scheme; examiners, however, often rely solely on the exact wording of the marking scheme. This leads to inconsistent grading and, in many cases, disadvantages candidates. This study addresses these inconsistencies by evaluating the marked answer scripts and the marking scheme of Introduction to File Processing (CSC 221) from the Department of Computer Science, University of Uyo, Nigeria, together with the Microsoft Research Paraphrase (MSRP) corpus. After preprocessing, the datasets were analysed with Logistic Regression (LR), a machine learning technique: a paraphrase model was first trained on the MSRP corpus using Term Frequency-Inverse Document Frequency (TF-IDF) vectorization, and this model was then used to measure the semantic similarity of the candidates' answers to the examiner's marking scheme. Results of the experiment show a strong correlation coefficient of 0.89 and a Mean Relative Error (MRE) of 0.59 when the automated scores are compared with those awarded by the human marker (examiner). Error analysis indicates that the examiner assigned block marks to answers in the marking scheme, whereas the automated marking system splits the block marks into chunks based on phrases in both the marking scheme and the candidates' answers. It also shows that some semantically related words were ignored by the examiner.
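The pipeline described above can be illustrated with a minimal sketch (not the authors' code): a TF-IDF plus Logistic Regression paraphrase classifier is trained on MSRP-style sentence pairs, and its predicted paraphrase probability is reused as a semantic-similarity score between a candidate's answer and each marking-scheme phrase. The toy training pairs, the element-wise-product pair encoding, and the helper names (`pair_features`, `similarity_score`, `mark_answer`) are assumptions made for illustration; the paper does not specify these details.

```python
# Sketch only: TF-IDF + Logistic Regression paraphrase model used to award
# partial marks per marking-scheme phrase. Data and pair encoding are assumed.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Assumed MSRP-style training data: (sentence_1, sentence_2, is_paraphrase)
train_pairs = [
    ("the file is stored on disk", "the file resides on disk", 1),
    ("records are accessed sequentially", "the printer is out of paper", 0),
]

vectorizer = TfidfVectorizer(lowercase=True, stop_words="english")
vectorizer.fit([s for pair in train_pairs for s in pair[:2]])

def pair_features(s1, s2):
    # Represent a sentence pair as the element-wise product of its TF-IDF
    # vectors (one simple choice; the paper's exact pair encoding is not given).
    v1 = vectorizer.transform([s1]).toarray()[0]
    v2 = vectorizer.transform([s2]).toarray()[0]
    return v1 * v2

X = np.array([pair_features(s1, s2) for s1, s2, _ in train_pairs])
y = np.array([label for _, _, label in train_pairs])

clf = LogisticRegression(max_iter=1000).fit(X, y)

def similarity_score(candidate_answer, scheme_point):
    # Paraphrase probability used as a semantic-similarity score in [0, 1].
    return clf.predict_proba([pair_features(candidate_answer, scheme_point)])[0, 1]

def mark_answer(candidate_answer, scheme_points_with_marks):
    # Award a fraction of the marks attached to each marking-scheme phrase,
    # rather than an all-or-nothing block mark.
    return sum(similarity_score(candidate_answer, point) * marks
               for point, marks in scheme_points_with_marks)

print(mark_answer("data is kept on disk in files",
                  [("the file is stored on disk", 2.0)]))
```

Splitting the marking scheme into phrases and scoring each one separately mirrors the error analysis above: it is what allows the automated marker to award partial credit where the human examiner assigned a single block mark.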