Combining Multiple Text Representations for Improved Automatic Evaluation of Indonesian Essay Answers

Moh Edi Wibowo, Agus Sihabudin, Nur Rokhman

doi:10.15294/sji.v11i3.9440

Abstract

multiple-choices, regarding students’ learning achievement. When a class has a very large number of students, however, examinations based on essay questions become hard to conduct and take a long time to evaluate. Automatic essay evaluation has therefore become a promising approach in this situation. Various methods have been proposed; however, optimal solutions for such evaluation in the Indonesian language are less well known. Furthermore, with the rapid development of machine learning approaches, in particular deep learning approaches, investigating such optimal solutions has become more necessary.

Method: To address this issue, this study investigated text representation approaches for optimal automatic evaluation of Indonesian essay answers. The investigation compared pre-trained word embedding methods such as Word2vec, GloVe, FastText, and RoBERTa, and also compared text encoding methods such as long short-term memories (LSTMs) and transformers. LSTMs are able to capture temporal semantics by employing state variables, while transformers are able to capture long-term dependencies between parts of their input sequences. Additionally, we investigated classification-based and similarity-based training to build text encoders. We expected these training approaches to allow encoders to extract different views of information. We compared classification results produced by different text encoders and combinations of text encoders.

Result: We evaluated various text representation approaches using the UKARA dataset. Our experiments showed that the FastText word embedding method outperformed the Word2vec, GloVe, and RoBERTa methods. The FastText method achieved an F1-score of 75.43% on validation sets, while the Word2vec, GloVe, and RoBERTa methods achieved F1-scores of 69.56%, 74.53%, and 72.87%, respectively. In addition, the experiments showed that combinations of text encoders outperformed individual encoders. The combination of the LSTM encoder, the transformer encoder, and the TF-IDF encoder obtained an F1-score of 77.22% in the best case, which is better than the best F1-scores of the individual LSTM encoders (75.35%), the best combination of transformer encoders (71.49%), and the individual TF-IDF encoder (76.69%). We observed that LSTM encoders performed better when they were built using classification-based training, whereas the transformer encoders performed better when built using similarity-based training.

Novelty: The novelty proposed in this research is the optimal combination of text encoders specifically constructed for the evaluation of essay answers in the Indonesian language. Our experiments showed that the combination of three encoders, namely the LSTM encoder built using classification-based training, the transformer encoder built using classification-based and similarity-based training, and the TF-IDF encoder, obtained the best classification performance.
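
The abstract does not state how the encoder outputs are combined, so the following is only a minimal sketch under the assumption that each encoder yields one vector per answer and that the vectors are simply concatenated before classification. The function names (tfidf_encode, combine, f1_score), the placeholder LSTM and transformer vectors, and the two sample answers are all hypothetical and are not taken from the paper or from the UKARA dataset. A binary F1-score is also assumed.

```python
import math
from collections import Counter


def tfidf_encode(docs):
    """Encode tokenized answers as dense TF-IDF vectors over a shared vocabulary."""
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    # Document frequency: number of answers that contain each token.
    df = Counter(tok for doc in docs for tok in set(doc))
    # Smoothed IDF, so a token present in every answer keeps a nonzero weight.
    idf = {tok: math.log((1 + n_docs) / (1 + df[tok])) + 1.0 for tok in vocab}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append([tf[tok] / len(doc) * idf[tok] for tok in vocab])
    return vectors


def combine(*views):
    """Concatenate, per answer, the vectors produced by several encoders."""
    return [[x for vec in per_doc for x in vec] for per_doc in zip(*views)]


def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Two made-up Indonesian answers, tokenized on whitespace (illustrative only).
answers = [
    "air menguap karena panas".split(),
    "air membeku karena dingin".split(),
]
tfidf_view = tfidf_encode(answers)
# Placeholders standing in for learned encoder outputs (assumption).
lstm_view = [[0.2, 0.7], [0.9, 0.1]]
transformer_view = [[0.5], [0.3]]

combined = combine(lstm_view, transformer_view, tfidf_view)
print(len(combined[0]))  # dimensionality of the combined representation
print(f1_score([1, 1, 0, 0], [1, 0, 0, 1]))
```

In this sketch the combined vectors would be passed to whatever classifier is used downstream; concatenation keeps each encoder's "view" intact, which is one simple way to let a classifier draw on all of them at once.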


Journal: Scientific Journal of Informatics
Publication Date: Aug 30, 2024
License type: cc-by

