Visual question answering model based on visual relationship detection

Yuling Xi,Yanning Zhang,Songtao Ding,Shaohua Wan

doi:10.1016/j.image.2019.115648

Abstract

visual question answering (VQA) is a learning task involving two major fields of computer vision and natural language processing. The development of deep learning technology has contributed to the advancement of this research area. Although the research on the question answering model has made great progress, the low accuracy of the VQA model is mainly because the current question answering model structure is relatively simple, the attention mechanism of model is deviated from human attention and lacks a higher level of logical reasoning ability. In response to the above problems, we propose a VQA model based on multi-objective visual relationship detection. Firstly, the appearance feature is used to replace the image features from the original object, and the appearance model is extended by the principle of word vector similarity. The appearance features and relationship predicates are then fed into the word vector space and represented by a fixed length vector. Finally, through the concatenation of elements between the image feature and the question vector are fed into the classifier to generate an output answer. Our method is benchmarked on the DQAUAR data set, and evaluated by the Acc WUPS@0.0 and WUPS@0.9.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

R Discovery Prime

R Discovery Prime

Visual question answering model based on visual relationship detection

Abstract

Talk to us

Similar Papers

More From: Signal Processing: Image Communication

Lead the way for us

Journal: Signal Processing: Image Communication	Publication Date: Sep 27, 2019
Citations: 62

Similar Papers

Estimation of Visual Contents from Human Brain Signals via VQA Based on Brain-Specific Attention
Ryo Shichida ... Takahiro Ogawa
-
Ryo Shichida, et. al.Ryo Shichida ... Takahiro Ogawa
04 Jun 2023
04 Jun 2023

Can Pre-training help VQA with Lexical Variations?
Shailza Jolly ... Shubham Kapoor
-
Shailza Jolly, et. al.Shailza Jolly ... Shubham Kapoor
01 Jan 2020
01 Jan 2020

Don't Just Assume; Look and Answer: Overcoming Priors for Visual Question Answering
Aishwarya Agrawal ... Dhruv Batra
-
Aishwarya Agrawal, et. al.Aishwarya Agrawal ... Dhruv Batra
01 Jun 2018
01 Jun 2018

Knowledge-Routed Visual Question Reasoning: Challenges for Deep Representation Embedding.
Qingxing Cao ... Liang Lin
IEEE Transactions on Neural Networks and Learning Systems | VOL. 33
Qingxing Cao, et. al.Qingxing Cao ... Liang Lin
01 Jan 2020
IEEE Transactions on Neural Networks and Learning Systems | VOL. 33

Editage

Paperpal

R Discovery

Mind the Graph

R Discovery Prime

R Discovery Prime

Visual question answering model based on visual relationship detection

Abstract

Talk to us

Similar Papers

More From: Signal Processing: Image Communication