Leveraging Structured Information from a Passage to Generate Questions

Jian Xu, Mingtao Zhou, Jianhou Gan, Di Wu, Yu Sun

doi:10.26599/tst.2022.9010034

Abstract

Question Generation (QG) is the task of utilizing Artificial Intelligence (AI) technology to generate questions that can be answered by a span of text within a given passage. Existing research on QG in the educational field faces two challenges: mainstream QG models based on seq-to-seq fail to utilize the structured information in the passage, and specialized educational QG datasets are lacking. To address these challenges, a specialized QG dataset, the reading comprehension dataset from examinations for QG (named RACE4QG), is reconstructed by applying a new answer tagging approach and a data-filtering strategy to the RACE dataset. Further, an end-to-end QG model that can exploit intra- and inter-sentence information to generate better questions is proposed. In our model, the encoder uses a Gated Recurrent Unit (GRU) network, which takes the concatenation of the word embedding, the answer tagging, and the Graph Attention neTworks (GAT) embedding as input. A gated self-attention is applied to the hidden states of the GRU to obtain the final passage-answer representation, which is fed to the decoder. Results show that our model outperforms the baselines on both automatic metrics and human evaluation. Specifically, the model improves on the baseline by 0.44, 1.32, and 1.34 on the BLEU-4, ROUGE-L, and METEOR metrics, respectively, indicating the effectiveness and reliability of our model. The remaining gap between its output and human expectations also reflects the potential for further research.
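The encoder data flow described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the weights are random, the GRU is a single unidirectional layer, the GAT embedding is treated as a precomputed input, and the gated self-attention follows one commonly used formulation (attention context, candidate update, and gate), which is assumed here because the abstract does not give the equations. All function names and dimensions are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


def gru_encode(inputs, params, hidden_size):
    """Run a single-layer, unidirectional GRU over inputs of shape (T, D)."""
    Wz, Wr, Wh = params  # each has shape (D + hidden_size, hidden_size)
    h = np.zeros(hidden_size)
    states = []
    for x in inputs:
        xh = np.concatenate([x, h])
        z = sigmoid(xh @ Wz)  # update gate
        r = sigmoid(xh @ Wr)  # reset gate
        h_tilde = np.tanh(np.concatenate([x, r * h]) @ Wh)  # candidate state
        h = (1.0 - z) * h + z * h_tilde
        states.append(h)
    return np.stack(states)  # (T, hidden_size)


def gated_self_attention(U, Ws, Wf, Wg):
    """Assumed formulation: each state attends over all states, then a gate
    mixes the attention-based update with the original state."""
    scores = (U @ Ws) @ U.T                # (T, T) pairwise matching scores
    A = softmax(scores, axis=-1)           # attention weights per position
    S = A @ U                              # (T, H) self-attention context
    US = np.concatenate([U, S], axis=-1)   # (T, 2H)
    F = np.tanh(US @ Wf)                   # candidate updated representation
    G = sigmoid(US @ Wg)                   # gate in (0, 1)
    return G * F + (1.0 - G) * U           # final passage-answer representation


def encode_passage(word_emb, answer_tag_emb, gat_emb, hidden_size=8, seed=0):
    """Concatenate the three per-token features, run the GRU, apply gating."""
    rng = np.random.default_rng(seed)  # random weights, for illustration only
    X = np.concatenate([word_emb, answer_tag_emb, gat_emb], axis=-1)
    D = X.shape[1]
    gru_params = [
        rng.normal(scale=0.1, size=(D + hidden_size, hidden_size))
        for _ in range(3)
    ]
    U = gru_encode(X, gru_params, hidden_size)
    Ws = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
    Wf = rng.normal(scale=0.1, size=(2 * hidden_size, hidden_size))
    Wg = rng.normal(scale=0.1, size=(2 * hidden_size, hidden_size))
    return gated_self_attention(U, Ws, Wf, Wg)


# Toy passage of 5 tokens with hypothetical feature sizes.
rng = np.random.default_rng(1)
word_emb = rng.normal(size=(5, 6))        # word embeddings
answer_tag_emb = rng.normal(size=(5, 2))  # answer-tag features
gat_emb = rng.normal(size=(5, 4))         # stand-in for GAT node embeddings
representation = encode_passage(word_emb, answer_tag_emb, gat_emb)
```

In this sketch, `representation` holds one vector per passage token; in the model described above, the corresponding representation is what the decoder receives.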

Full Text