Abstract

Educational automatic question generation (AQG) often cannot realize its full potential in educational applications due to insufficient training data. For this reason, current research relies on noneducational question answering datasets for system training and evaluation. However, noneducational training data may exhibit different language patterns than educational data, which raises the question of whether models trained on noneducational datasets transfer well to the educational AQG task. In this work, we investigate the AQG subtask of answer selection, which aims to extract meaningful answers for the questions to be generated. We train and evaluate six modern, well-established BERT-based machine learning model architectures on two widely used noneducational datasets. Furthermore, we introduce TQA-A, a novel midsized educational dataset for answer selection, which we use to investigate how well the noneducational models transfer to the educational domain. In terms of phrase-level evaluation metrics, the noneducational models perform similarly to models trained directly on the novel educational TQA-A dataset, despite being trained on considerably more data. Moreover, models trained directly on TQA-A select fewer named-entity-based and more verb-based answers than the noneducational models. This provides evidence that noneducational and educational answer selection tasks differ.
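To make the answer selection subtask concrete, the sketch below shows one common way to frame it: token classification (BIO tagging) over a passage with a BERT-based model. The checkpoint name, label scheme, and example passage are illustrative assumptions, not the paper's exact configuration.

```python
# A minimal sketch of answer selection framed as BIO token classification,
# one common way to build BERT-based answer extractors. The model name,
# label set, and passage are assumptions for illustration only.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

# Hypothetical label scheme: B/I tags mark tokens inside a selected answer span.
labels = ["O", "B-ANS", "I-ANS"]

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
# Note: the classification head is randomly initialized here; predictions
# are meaningless until the model is fine-tuned on answer selection data.
model = AutoModelForTokenClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(labels)
)

passage = (
    "Photosynthesis converts light energy into chemical energy "
    "stored in glucose."
)
inputs = tokenizer(passage, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, seq_len, num_labels)

# Decode predicted tags; contiguous B-ANS/I-ANS runs form candidate answers.
pred_ids = logits.argmax(dim=-1)[0].tolist()
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
for tok, tag_id in zip(tokens, pred_ids):
    print(f"{tok:15s} {labels[tag_id]}")
```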
