Prompt–RSVQA: Prompting visual context to a language model for Remote Sensing Visual Question Answering

Christel Chappuis, Devis Tuia, Valerie Zermatten, Bertrand Le Saux, Sylvain Lobry

doi:10.1109/cvprw56347.2022.00143


Open Access

https://doi.org/10.1109/cvprw56347.2022.00143


Abstract

Remote sensing visual question answering (RSVQA) was recently proposed to interface natural language and vision, easing access to the information contained in Earth Observation data for a wide audience through simple questions in natural language. The traditional vision/language interface is an embedding obtained by fusing features from two deep models, one processing the image and the other the question. Despite the success of early VQA models, it remains difficult to control the adequacy of the visual information extracted by the image model, which should act as context regularizing the work of the language model. We propose to extract this context information with a visual model, convert it to text, and inject it, i.e. prompt it, into a language model. The language model is therefore responsible for processing the question together with the visual context and for extracting features that are useful for finding the answer. We study the effect of prompting with respect to a black-box visual extractor and discuss the importance of training a visual model that produces accurate context.
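
The prompting step described above (visual predictions converted to text, then passed to the language model alongside the question) can be illustrated with a minimal sketch. It is not the authors' implementation: the class names, the 0.5 score threshold, the `[SEP]` separator, and the helper names `context_to_text` and `build_prompt` are all assumptions made for illustration, and the abstract does not specify the actual prompt format.

```python
from typing import Dict, List


def context_to_text(class_scores: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Turn visual-model scores into context words (hypothetical helper).

    Keeps the class names whose score reaches the threshold, ordered from
    the highest score to the lowest. The threshold value is an assumption.
    """
    kept = [name for name, score in class_scores.items() if score >= threshold]
    return sorted(kept, key=lambda name: -class_scores[name])


def build_prompt(question: str, context_words: List[str], sep: str = " [SEP] ") -> str:
    """Append the textual visual context to the question (hypothetical format)."""
    question = question.strip()
    if not context_words:
        # No confident visual context: the language model sees the question only.
        return question
    return question + sep + ", ".join(context_words)


# Illustrative scores, standing in for the output of a visual model.
scores = {"water": 0.91, "forest": 0.72, "urban": 0.10}
prompt = build_prompt("Is there a river in the image?", context_to_text(scores))
print(prompt)  # Is there a river in the image? [SEP] water, forest
```

In this sketch the language model only ever sees text, so the quality of the answer depends directly on how accurate the predicted context words are, which is the dependency the abstract highlights.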
