Interpretable Visual Understanding with Cognitive Attention Network.

Xuejiao Tang, Kea Turner, Tyler Derr, Mengyu Wang, Eirini Ntoutsi, Yi Yu, Wenbin Zhang

doi:10.1007/978-3-030-86362-3_45

Abstract

While image understanding at the recognition level has advanced remarkably, reliable visual scene understanding requires comprehensive image understanding not only at the recognition level but also at the cognition level, which calls for exploiting multi-source information as well as learning different levels of understanding and extensive commonsense knowledge. In this paper, we propose a novel Cognitive Attention Network (CAN) for visual commonsense reasoning to achieve interpretable visual understanding. Specifically, we first introduce an image-text fusion module to fuse information from images and text collectively. Second, a novel inference module is designed to encode commonsense among image, query and response. Extensive experiments on the large-scale Visual Commonsense Reasoning (VCR) benchmark dataset demonstrate the effectiveness of our approach. The implementation is publicly available at https://github.com/tanjatang/CAN.
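The abstract names two components, an image-text fusion module and an inference module over image, query and response, without giving their internals. The following is a minimal sketch of the general idea only, assuming scaled dot-product attention for fusion and a mean-pooled dot product for scoring; it is not the CAN architecture, and every function name here (`attend`, `fuse`, `score`, `pick_response`) is hypothetical. The authors' implementation is in the repository linked above.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attend(queries, keys):
    """Scaled dot-product attention where the keys also serve as values.

    Returns, for each query vector, an attention-weighted average of the keys.
    """
    dim = len(keys[0])
    scale = math.sqrt(dim)
    attended = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in keys])
        attended.append(
            [sum(w * k[i] for w, k in zip(weights, keys)) for i in range(dim)]
        )
    return attended


def fuse(text_tokens, image_regions):
    """Assumed fusion step: concatenate each text token with its attended image summary."""
    visual = attend(text_tokens, image_regions)
    return [t + v for t, v in zip(text_tokens, visual)]


def mean_pool(rows):
    """Average a list of vectors into a single vector."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def score(query_tokens, response_tokens, image_regions):
    """Toy compatibility score between a fused query and a fused response."""
    fused_query = mean_pool(fuse(query_tokens, image_regions))
    fused_response = mean_pool(fuse(response_tokens, image_regions))
    return dot(fused_query, fused_response)


def pick_response(query_tokens, candidate_responses, image_regions):
    """Score every candidate response and return (best index, probabilities)."""
    scores = [score(query_tokens, r, image_regions) for r in candidate_responses]
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs
```

With two toy image regions `[[1.0, 0.0], [0.0, 1.0]]`, a single query token `[1.0, 0.0]`, and two one-token candidates, `pick_response` returns the index of the candidate whose fused representation aligns best with the fused query, along with a probability for each candidate. A real model would learn the embeddings and scoring function rather than use raw dot products.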

Journal: Artificial neural networks, ICANN : international conference ... proceedings. International Conference on Artificial Neural Networks (European Neural Network Society)
Publication Date: Jan 1, 2021
Citations: 2

Similar Papers

NeuSyRE: Neuro-symbolic visual understanding and reasoning framework based on scene graph enrichment
M Jaleed Khan ... Edward Curry
Semantic Web | 13 Dec 2023

Common Sense Knowledge Infusion for Visual Understanding and Reasoning: Approaches, Challenges, and Applications
Muhammad Jaleed Khan ... John G Breslin
IEEE Internet Computing | VOL. 26 | 01 Jul 2022

Explicit Cross-Modal Representation Learning for Visual Commonsense Reasoning
Xi Zhang ... Feifei Zhang
IEEE Transactions on Multimedia | VOL. 24 | 01 Jan 2021

Cognitive Visual Commonsense Reasoning Using Dynamic Working Memory
Xuejiao Tang ... Wenbin Zhang
01 Jan 2020
