Abstract

Machine comprehension is one of the primary goals of Artificial Intelligence (AI) and Natural Language Processing (NLP). Assessing the difficulty level of machine reading comprehension (MRC) questions is important for building accurate MRC systems. To tackle this problem, we propose a novel approach that assesses the difficulty level of MRC questions according to the amount of linguistic information required to answer them. Specifically, we systematically analyze and compare the performance of each BERT layer representation per question type on MRC datasets, and highlight the characteristics of the datasets according to the linguistic information captured at different layers. Our extensive analysis suggests that the superficial categories (or question types) of MRC questions do not directly reflect their difficulty levels, and that it is possible to analyze the difficulty level of MRC questions according to the amount of linguistic information required.
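As a rough illustration of the layer-wise analysis the abstract describes, the sketch below (not the authors' code; model name, example question, and mean-pooling are illustrative assumptions) extracts one representation per BERT layer for a question-context pair, which could then feed a per-question-type probe:

```python
# A minimal sketch, assuming bert-base-uncased and mean-pooled token vectors,
# of obtaining per-layer BERT representations for a question-context pair.
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
model.eval()

# Hypothetical MRC example, not taken from the paper's datasets.
question = "When was the bridge completed?"
context = "The bridge was completed in 1937 after four years of construction."
inputs = tokenizer(question, context, return_tensors="pt", truncation=True)

with torch.no_grad():
    outputs = model(**inputs)

# hidden_states holds the embedding layer plus all 12 transformer layers,
# each of shape (batch, seq_len, hidden_size); pool tokens per layer.
layer_vectors = [h.mean(dim=1).squeeze(0) for h in outputs.hidden_states]
for i, vec in enumerate(layer_vectors):
    print(f"layer {i}: vector of size {vec.shape[0]}")

# Each layer_vectors[i] could be fed to a simple classifier (a probe); comparing
# probe performance across layers and question types would indicate how much
# linguistic information each layer contributes per question type.
```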
