Abstract

For visual question answering on remote sensing imagery (RSVQA), current methods scarcely account for geospatial objects, which typically exhibit large scale differences and position-sensitive properties. Moreover, modeling and reasoning about the relationships between entities has rarely been explored, which leads to one-sided and inaccurate answer predictions. In this article, a novel method called the spatial hierarchical reasoning network (SHRNet) is proposed, which endows a remote sensing (RS) visual question answering (VQA) system with enhanced visual–spatial reasoning capability. Specifically, a hash-based spatial multiscale visual representation module is first designed to encode multiscale visual features embedded with spatial positional information. Then, spatial hierarchical reasoning is conducted to learn high-order inner-group object relations across multiple scales under the guidance of linguistic cues. Finally, a visual-question (VQ) interaction module is employed to learn an effective image–text joint embedding for final answer prediction. Experimental results on three public RS VQA datasets confirm the effectiveness and superiority of the proposed SHRNet.
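
The abstract describes a three-stage pipeline (multiscale visual encoding with spatial positional information, question-guided reasoning over object relations across scales, and a VQ interaction module for answer prediction). The sketch below illustrates one way such a pipeline could be wired together in PyTorch. It is a minimal, hypothetical illustration only: all module names, dimensions, and internals (e.g., learned positional embeddings and attention-based fusion) are assumptions, and it does not reproduce the paper's hash-based encoding or hierarchical reasoning details.

```python
# Hypothetical sketch of a three-stage RSVQA pipeline in the spirit of the
# abstract: (1) multiscale visual features with spatial positional encodings,
# (2) question-guided reasoning over relations across scales,
# (3) a visual-question interaction module producing answer logits.
# All internals are assumptions, not the SHRNet implementation.
import torch
import torch.nn as nn


class SpatialMultiScaleEncoder(nn.Module):
    """Projects per-scale visual features and adds learned positional embeddings."""
    def __init__(self, dims=(256, 512, 1024), d_model=512, num_tokens=64):
        super().__init__()
        self.proj = nn.ModuleList(nn.Linear(d, d_model) for d in dims)
        self.pos = nn.ParameterList(
            nn.Parameter(torch.zeros(1, num_tokens, d_model)) for _ in dims
        )

    def forward(self, feats):  # feats: list of (B, N, C_s) tensors, one per scale
        return [p(f) + e for p, f, e in zip(self.proj, feats, self.pos)]


class QuestionGuidedReasoning(nn.Module):
    """Cross-attends each scale to the question, then models relations across scales."""
    def __init__(self, d_model=512, heads=8):
        super().__init__()
        self.cross = nn.MultiheadAttention(d_model, heads, batch_first=True)
        self.self_attn = nn.MultiheadAttention(d_model, heads, batch_first=True)

    def forward(self, scale_feats, q_tokens):  # q_tokens: (B, L, D)
        guided = []
        for v in scale_feats:
            v2, _ = self.cross(v, q_tokens, q_tokens)  # linguistic guidance
            guided.append(v + v2)
        fused = torch.cat(guided, dim=1)               # relations across scales
        fused2, _ = self.self_attn(fused, fused, fused)
        return fused + fused2


class VQInteraction(nn.Module):
    """Joint image-text embedding followed by an answer classifier."""
    def __init__(self, d_model=512, num_answers=100):
        super().__init__()
        self.cls = nn.Sequential(nn.Linear(2 * d_model, d_model),
                                 nn.ReLU(),
                                 nn.Linear(d_model, num_answers))

    def forward(self, visual, q_tokens):
        joint = torch.cat([visual.mean(dim=1), q_tokens.mean(dim=1)], dim=-1)
        return self.cls(joint)


if __name__ == "__main__":
    B, L = 2, 12
    feats = [torch.randn(B, 64, c) for c in (256, 512, 1024)]
    q = torch.randn(B, L, 512)
    enc, reason, head = SpatialMultiScaleEncoder(), QuestionGuidedReasoning(), VQInteraction()
    logits = head(reason(enc(feats), q), q)
    print(logits.shape)  # torch.Size([2, 100])
```

The staged design mirrors the abstract's decomposition: visual features are first made scale- and position-aware, reasoning is conditioned on the question before fusion, and only the final module combines both modalities for classification over a fixed answer set.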
