Abstract
Visual question generation (VQG) is a fundamental task in vision-language understanding that aims to generate relevant questions about a given input image. In this paper, we propose a paragraph-based VQG approach for generating intelligent natural-language questions about remote sensing (RS) images. Specifically, our proposed framework consists of two transformer-based vision and language models. First, we employ a Swin Transformer encoder to generate a multi-scale representative visual feature from the image. This feature is then used as a prefix to guide a Generative Pre-trained Transformer 2 (GPT-2) decoder in generating multiple questions in the form of a paragraph, covering the abundant visual information contained in the RS scene. To train the model, the language decoder is fine-tuned on an RS dataset to generate a set of relevant questions from the RS image. We evaluate our model on two visual question answering (VQA) datasets in RS. Additionally, we construct a new dataset, termed TextRS-VQA, for a better evaluation of our VQG model. This dataset consists of questions fully annotated by humans, which addresses the high redundancy of the questions in prior VQA datasets. Extensive experiments using several accuracy and diversity metrics demonstrate the effectiveness of our proposed VQG model in generating meaningful, valid, and diverse questions from RS images. The TextRS-VQA dataset is available at: https://github.com/yakoubbazi/TextRS.
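The sketch below illustrates the general idea of prefix-guided question generation described in the abstract: a Swin Transformer encodes the RS image into a visual feature, which is projected into a few "prefix" embeddings that condition a GPT-2 decoder fine-tuned on (image, question-paragraph) pairs. It is a minimal illustration, not the authors' implementation; the model names, prefix length, and projection layer are assumptions.

```python
# Minimal sketch of prefix-conditioned question generation (assumed design).
import torch
import torch.nn as nn
import timm
from transformers import GPT2LMHeadModel, GPT2Tokenizer

PREFIX_LEN = 10  # assumed number of visual prefix tokens


class SwinPrefixVQG(nn.Module):
    def __init__(self):
        super().__init__()
        # Swin encoder returning a pooled global feature (num_classes=0 removes the head)
        self.encoder = timm.create_model(
            "swin_base_patch4_window7_224", pretrained=True, num_classes=0
        )
        self.decoder = GPT2LMHeadModel.from_pretrained("gpt2")
        emb_dim = self.decoder.config.n_embd
        # Project the visual feature into PREFIX_LEN embeddings in the decoder space
        self.project = nn.Linear(self.encoder.num_features, PREFIX_LEN * emb_dim)

    def forward(self, images, question_ids):
        # images: (B, 3, 224, 224); question_ids: (B, T) tokenized question paragraph
        visual = self.encoder(images)                          # (B, feat_dim)
        prefix = self.project(visual).view(images.size(0), PREFIX_LEN, -1)
        tok_emb = self.decoder.transformer.wte(question_ids)   # (B, T, emb_dim)
        inputs_embeds = torch.cat([prefix, tok_emb], dim=1)
        # Prefix positions get label -100 so they are ignored by the LM loss
        labels = torch.cat(
            [torch.full((images.size(0), PREFIX_LEN), -100, dtype=torch.long),
             question_ids],
            dim=1,
        )
        return self.decoder(inputs_embeds=inputs_embeds, labels=labels)


# Usage sketch: one fine-tuning step on a toy batch of image/question-paragraph pairs.
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token
model = SwinPrefixVQG()
images = torch.randn(2, 3, 224, 224)
questions = tokenizer(
    ["How many buildings are there? Is there a road?"] * 2,
    return_tensors="pt", padding=True,
)["input_ids"]
loss = model(images, questions).loss
loss.backward()
```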