Abstract

This study investigates the relative efficacy of using n-gram extracted terms, the aggregation of such terms, and a combination of feature extraction techniques in building an automated essay-type grading (AETG) system. The paper focuses on modifying Principal Component Analysis (PCA) by integrating n-gram terms as input into the PCA algorithm. Hard copies of examiners' marking schemes and soft copies of students' answers for two courses, Management Information System (COM 317) and Research Methodology (COM 325), offered at the Department of Computer Science, Federal Polytechnic, Ilaro, during the 2013/2014 academic session, were used as case studies. The textual contents of the marking schemes were transcribed into electronic documents in the same file format as the students' answers. The documents were pre-processed to remove stopwords, and each keyword was stemmed to address morphological variations. N-gram terms (N = 2, 3) were then extracted across all students' answer scripts and marking scheme documents for each of the two courses. The documents were represented in the vector space model as a Document Term Matrix. The PCA algorithm was modified by integrating n-gram terms as input into the existing PCA to derive the Modified Principal Component Analysis (MPCA) algorithm, which was used to reduce the sparseness of the matrix. Document similarity was measured using the cosine similarity measure, which compared each student's answer script document vector with the marking scheme document vector. The MPCA-based AETG system outperformed its PCA-based equivalent, with a high positive correlation and a lower mean absolute error when the system's scores were compared with those of the human markers. In future work, we intend to explore other approaches that will be able to capture non-textual content.
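The n-gram extraction and cosine-similarity steps described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the stopword list, the function names, and the two sample texts are all hypothetical, stemming is omitted, and the PCA/MPCA reduction step is deliberately left out because the abstract does not specify how the modification was carried out.

```python
import math
from collections import Counter

# Hypothetical, deliberately tiny stopword list (assumption, not from the paper).
STOPWORDS = {"a", "an", "the", "of", "is", "are", "and", "to", "in", "for"}


def tokenize(text):
    """Lowercase, strip surrounding punctuation, and drop stopwords."""
    tokens = (w.strip(".,;:!?()\"'").lower() for w in text.split())
    return [t for t in tokens if t and t not in STOPWORDS]


def ngrams(tokens, n):
    """Return the contiguous n-token sequences as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(text, sizes=(2, 3)):
    """Count bigrams and trigrams (N = 2, 3) in one document."""
    tokens = tokenize(text)
    counts = Counter()
    for n in sizes:
        counts.update(ngrams(tokens, n))
    return counts


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document shares nothing with any other
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Made-up texts standing in for a marking scheme and a student's answer.
    scheme = "A management information system supports decision making."
    answer = "Management information system supports managers."
    score = cosine_similarity(ngram_counts(scheme), ngram_counts(answer))
    print(f"similarity = {score:.3f}")
```

Each `Counter` here plays the role of one row of the Document Term Matrix; in a fuller pipeline these rows would be stacked into a matrix and reduced before the similarity is computed.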

