Abstract
Automatic Question Generation (AQG) systems are applied in many domains to generate questions from sources such as documents, images, and knowledge graphs. With the rising interest in such AQG systems, it is equally important to handle structured data like tables when generating questions from documents. In this paper, we propose a single model architecture for question generation from tables together with text, using the "Text-to-Text Transfer Transformer" (T5), a fully end-to-end model that does not rely on intermediate planning steps, delexicalization, or copy mechanisms. We also present our systematic approach to modifying the ToTTo dataset, release the augmented dataset as TabQGen, and report the scores achieved using T5 as a baseline to aid further research.
Highlights
The development of end-to-end supervised Question-Answering (QA) models has been accelerated with the advent of large-scale datasets
The Stanford Question Answering Dataset (SQuAD) [5] is a reading comprehension dataset composed of questions from Wikipedia articles, with the answer to each question being a part of the corresponding reading passage
We emphasize the need for Automatic Question Generation (AQG) systems to effectively utilize all the available data in source documents and propose an Answer-Aware Question Generation system using T5 to generate questions from both tabular and textual data
Summary
The development of end-to-end supervised Question-Answering (QA) models has been accelerated with the advent of large-scale datasets. The Stanford Question Answering Dataset (SQuAD) [5] is a reading comprehension dataset composed of questions from Wikipedia articles, with the answer to each question being a part of the corresponding reading passage. Microsoft Machine Reading Comprehension (MS MARCO) [6] is a large-scale dataset focused on reading comprehension, question answering, passage ranking, keyphrase extraction, and conversational search. TriviaQA [7] is a realistic text-based question-answer dataset with 950K question-answer pairs extracted from Wikipedia and the web. Since the answers to its questions may not be obtainable via span prediction, TriviaQA is more challenging than traditional QA benchmark datasets such as SQuAD. DuoRC [8] comprises 186K distinct question-answer pairs derived from 7680 pairs of movie plots, each pair representing two different versions of the same film; it highlights the challenges of combining knowledge and reasoning in neural architectures for reading comprehension.
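Feeding a table to a text-to-text model such as T5 requires flattening it into a single input string, optionally prefixed with the target answer for answer-aware question generation. The sketch below is a hypothetical illustration of such linearization, assuming a simple "header is value" cell scheme; it is not necessarily the exact format used by ToTTo or TabQGen.

```python
def linearize_table(headers, rows, answer=None):
    """Flatten a table (and optionally an answer span, for answer-aware
    question generation) into one input string for a text-to-text model.

    This tagging scheme is an illustrative assumption, not the authors'
    published format.
    """
    parts = []
    if answer is not None:
        # Answer-aware QG: prepend the answer the question should target.
        parts.append(f"answer: {answer}")
    parts.append("table:")
    for row in rows:
        # Pair each cell with its column header so the model sees structure.
        cells = [f"{h} is {v}" for h, v in zip(headers, row)]
        parts.append("; ".join(cells))
    return " | ".join(parts)

headers = ["Player", "Team", "Goals"]
rows = [["Ada", "Rovers", "12"], ["Ben", "United", "9"]]
print(linearize_table(headers, rows, answer="12"))
# -> answer: 12 | table: | Player is Ada; Team is Rovers; Goals is 12 | Player is Ben; Team is United; Goals is 9
```

The resulting string can be passed to a sequence-to-sequence model's tokenizer as-is, with the reference question as the decoding target during fine-tuning.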
More From: International Journal of Emerging Technologies in Learning (iJET)