Identifying Where to Focus in Reading Comprehension for Neural Question Generation

Xinya Du, Claire Cardie

doi:10.18653/v1/d17-1219

Abstract

A first step in the task of automatically generating questions for testing reading comprehension is to identify question-worthy sentences, i.e. sentences in a text passage that humans find it worthwhile to ask questions about. We propose a hierarchical neural sentence-level sequence tagging model for this task, which existing approaches to question generation have ignored. The approach is fully data-driven — with no sophisticated NLP pipelines or any hand-crafted rules/features — and compares favorably to a number of baselines when evaluated on the SQuAD data set. When incorporated into an existing neural question generation system, the resulting end-to-end system achieves state-of-the-art performance for paragraph-level question generation for reading comprehension.

Highlights

Automatically generating questions for testing reading comprehension is a challenging task (Mannem et al., 2010; Rus et al., 2010).
First and foremost, the question generation system must determine which concepts in the associated text passage are important, i.e. are worth asking a question about.
Inspired by the large body of research in text summarization on identifying sentences that contain “summary-worthy” content (e.g. Mihalcea (2005), Berg-Kirkpatrick et al. (2011), Yang et al. (2017)), we develop a method to identify the question-worthy sentences in each paragraph of a reading comprehension passage.

Summary

Introduction

Automatically generating questions for testing reading comprehension is a challenging task (Mannem et al., 2010; Rus et al., 2010). Prior work focuses almost exclusively on sentence-level question generation: given a text passage, assume that all sentences contain a question-worthy concept and generate one or more questions for each (Heilman and Smith, 2010; Du et al., 2017; Zhou et al., 2017). Inspired further by the success of neural sequence models for many natural language processing tasks (e.g. named entity recognition (Collobert et al., 2011), sentiment classification (Socher et al., 2013), machine translation (Sutskever et al., 2014), dependency parsing (Chen and Manning, 2014)), including very recently document-level text summarization (Cheng and Lapata, 2016), we propose a hierarchical neural sentence-level sequence tagging model for question-worthy sentence identification.
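To make the task framing concrete, the following is a minimal sketch of sentence-level sequence tagging over a paragraph: each sentence is encoded from its words, each encoding is combined with paragraph context, and each sentence receives a binary question-worthy label. It is not the model from the paper. The averaging encoder, the paragraph-mean context (a stand-in for a learned layer over sentences), the hand-set weights, and all function names are assumptions chosen for illustration.

```python
import math
from typing import Dict, List

Vector = List[float]


def encode_sentence(tokens: List[str], embeddings: Dict[str, Vector], dim: int) -> Vector:
    """Lower level: average the word vectors of one sentence (assumed encoder)."""
    total = [0.0] * dim
    for token in tokens:
        word_vec = embeddings.get(token, [0.0] * dim)  # unknown words map to zeros
        total = [t + w for t, w in zip(total, word_vec)]
    count = max(len(tokens), 1)  # avoid division by zero on an empty sentence
    return [t / count for t in total]


def add_paragraph_context(sentence_vecs: List[Vector]) -> List[Vector]:
    """Upper level: append the paragraph mean to every sentence vector.

    Hypothetical stand-in for a learned layer over the sentence sequence,
    so that each decision can depend on the rest of the paragraph.
    """
    if not sentence_vecs:
        return []
    dim = len(sentence_vecs[0])
    mean = [sum(v[i] for v in sentence_vecs) / len(sentence_vecs) for i in range(dim)]
    return [vec + mean for vec in sentence_vecs]


def tag_paragraph(
    paragraph: List[List[str]],
    embeddings: Dict[str, Vector],
    weights: Vector,
    bias: float,
    dim: int,
    threshold: float = 0.5,
) -> List[int]:
    """Return one label per sentence: 1 = question-worthy, 0 = not."""
    sentence_vecs = [encode_sentence(s, embeddings, dim) for s in paragraph]
    labels = []
    for features in add_paragraph_context(sentence_vecs):
        score = sum(w * f for w, f in zip(weights, features)) + bias
        prob = 1.0 / (1.0 + math.exp(-score))  # logistic output per sentence
        labels.append(1 if prob >= threshold else 0)
    return labels


# Toy values, invented for illustration only (not trained parameters).
toy_embeddings = {"paris": [1.0, 0.0], "capital": [0.8, 0.2], "nice": [0.0, 1.0]}
toy_paragraph = [["paris", "capital"], ["nice"]]
toy_weights = [2.0, -2.0, 0.0, 0.0]  # 2 sentence dims + 2 context dims
toy_labels = tag_paragraph(toy_paragraph, toy_embeddings, toy_weights, bias=0.0, dim=2)
print(toy_labels)
```

The output has exactly one label per input sentence, which is what distinguishes this framing from sentence-level question generation that assumes every sentence is question-worthy. In a trained system the weights would be learned from labeled data such as SQuAD rather than set by hand.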

Objectives

Methods

Results

Conclusion