Abstract

The task of Question Answering has gained prominence in the past few decades as a means of testing the ability of machines to understand natural language. Large datasets for Machine Reading have led to the development of neural models that target deeper language understanding than information retrieval tasks require. Different components in these neural architectures are intended to tackle different challenges. As a first step towards achieving generalization across multiple domains, we attempt to understand and compare the peculiarities of existing end-to-end neural models on the Stanford Question Answering Dataset (SQuAD) by performing quantitative as well as qualitative analysis of the results attained by each of them. We observe that prediction errors reflect certain model-specific biases, which we discuss further in this paper.

Highlights

  • Machine Reading is a task in which a model reads a piece of text and attempts either to formally represent it or to perform a downstream task such as Question Answering (QA)

  • We focused on Bi-Directional Attention Flow (BiDAF) (Seo et al., 2016), Gated Self-Matching Networks (R-Net) (Wang et al., 2017), Document Reader (DrQA) (Chen et al., 2017), Multi-Paragraph Reading Comprehension (DocQA) (Clark and Gardner, 2017), and the Logistic Regression baseline model (Rajpurkar et al., 2016). We chose these models mainly because they achieve comparably high performance on the evaluation metrics (sketched after this list) and because their results are easy to replicate thanks to openly available implementations

  • We analyze, both quantitatively and qualitatively, the results generated by four end-to-end neural models on the Stanford Question Answering Dataset
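
The models highlighted above are typically compared on SQuAD's two standard metrics, Exact Match (EM) and token-level F1. The snippet below is a minimal sketch of how such scores are commonly computed (normalize the answer string, then check string equality for EM and token overlap for F1); the function names and example strings are our own illustrations, not taken from the paper or the official evaluation script.

```python
import re
import string
from collections import Counter

def normalize_answer(s):
    """Lowercase, drop articles and punctuation, collapse whitespace (SQuAD-style)."""
    s = s.lower()
    s = re.sub(r"\b(a|an|the)\b", " ", s)
    s = "".join(ch for ch in s if ch not in set(string.punctuation))
    return " ".join(s.split())

def exact_match(prediction, ground_truth):
    """1 if the normalized prediction equals the normalized reference, else 0."""
    return int(normalize_answer(prediction) == normalize_answer(ground_truth))

def f1_score(prediction, ground_truth):
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize_answer(prediction).split()
    gold_tokens = normalize_answer(ground_truth).split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

# Hypothetical example: a boundary error loses EM but keeps partial credit under F1.
print(exact_match("the Denver Broncos", "Denver Broncos"))  # 1 (article ignored)
print(f1_score("Broncos", "Denver Broncos"))                # ~0.67
```

This also illustrates why span-boundary errors, discussed later in the paper, hurt EM more sharply than F1.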


Summary

Introduction

Machine Reading is a task in which a model reads a piece of text and attempts either to formally represent it or to perform a downstream task such as Question Answering (QA). Neural approaches to the latter have gained a lot of prominence, especially owing to the recent surge in developing and publicly releasing large datasets on Machine Reading and Comprehension (MRC). These datasets are created from different underlying sources, such as web resources in MS MARCO (Nguyen et al., 2016); trivia and web in QUASAR-S and QUASAR-T (Dhingra et al., 2017), SearchQA (Dunn et al., 2017), and TriviaQA (Joshi et al., 2017); news articles in CNN/Daily Mail (Chen et al.) and NewsQA (Trischler et al., 2016); and stories in NarrativeQA (Kocisky et al., 2017).

Relevant Neural Models
Span-Level Performance
Sentence-Level Performance
Passage Length Distribution
Question Length Distribution
Answer Length Distribution
Error Overlap
Inference-Based Errors
Qualitative Analysis
Boundary-Based Errors
Findings
Observations
Conclusion and Future Work