Multi-view Visual Question Answering Dataset for Real Environment Applications

Yue Qiu, Yutaka Satoh, Ryota Suzuki, Kenji Iwata

doi:10.1007/978-3-030-50334-5_26

Abstract

In this paper, we propose a novel large-scale Visual Question Answering (VQA) dataset aimed at real-environment applications. Existing VQA datasets either require high labor costs to construct or have only limited power for evaluating the complicated scene understanding involved in VQA tasks. Moreover, most VQA datasets do not tackle scenes containing object occlusion, which can be crucial for real-world applications. In this work, we propose a synthetic multi-view VQA dataset along with a dataset generation process. We build our dataset from three real object model datasets. Each scene is observed from multiple virtual cameras, so answering questions often requires multi-view scene understanding. Our dataset requires relatively low labor cost while containing highly complicated visual information. In addition, our dataset can be further adapted to users’ requirements by extending the dataset setup. We evaluated two previous multi-view VQA methods on our dataset. The results show that both 3D understanding and appearance understanding are crucial to achieving high performance on our dataset, and that there is still room for future methods to improve. Our dataset provides a possible way to bridge VQA methods designed for CG datasets with real-world applications, such as robot picking tasks.
