Abstract
Machine Reading Comprehension (MRC) is a challenging Natural Language Processing (NLP) research field with wide real-world applications. The great progress of this field in recent years is mainly due to the emergence of large-scale datasets and deep learning. At present, many MRC models have already surpassed human performance on various benchmark datasets, despite the obvious large gap between existing MRC models and genuine human-level reading comprehension. This shows the need to improve existing datasets, evaluation metrics, and models to move current MRC models toward “real” understanding. To address the current lack of a comprehensive survey of existing MRC tasks, evaluation metrics, and datasets, herein (1) we analyze 57 MRC tasks and datasets and propose a more precise classification method for MRC tasks with 4 different attributes; (2) we summarize 9 evaluation metrics of MRC tasks, as well as 7 attributes and 10 characteristics of MRC datasets; (3) we discuss key open issues in MRC research and highlight future research directions. In addition, we have collected, organized, and published our data on the companion website, where MRC researchers can directly access each MRC dataset, its papers, baseline projects, and leaderboard.
Highlights
In the long history of Natural Language Processing (NLP), teaching computers to read text and understand its meaning has been a major research goal that is not yet fully realized
We conducted a comprehensive survey of recent efforts on the tasks, evaluation metrics, and benchmark datasets of machine reading comprehension (MRC)
We introduce how the different MRC evaluation metrics are computed and analyze their usage in each type of MRC task
Summary
In the long history of Natural Language Processing (NLP), teaching computers to read text and understand its meaning has been a major research goal that is not yet fully realized. To pursue this goal, researchers have recently studied machine reading comprehension (MRC) from many angles, aided by the emergence of large-scale datasets, greater computing power, and deep learning techniques, which have boosted NLP research as a whole [1,2,3]. The concept of MRC comes from the human understanding of text. The most common way to test whether a person fully understands a piece of text is to ask them to answer questions about it. As with human language tests, reading comprehension is a natural way to evaluate a computer’s language understanding ability.