Abstract
The task in referring expression comprehension is to localise the object instance in an image described by a referring expression phrased in natural language. As a language-to-vision matching task, the key to this problem is to learn a discriminative object feature that can adapt to the expression used. To avoid ambiguity, an expression typically describes not only the properties of the referent itself, but also its relationships to its neighbourhood. To capture and exploit this important information, we propose a graph-based, language-guided attention mechanism. Composed of a node attention component and an edge attention component, the proposed graph attention mechanism explicitly represents inter-object relationships and object properties with a flexibility and power not possible with competing approaches. Furthermore, the proposed graph attention mechanism makes the comprehension decision visualisable and explainable. Experiments on three referring expression comprehension datasets demonstrate the advantages of the proposed approach.
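To make the idea of language-guided node and edge attention concrete, the following is a minimal sketch in plain NumPy. It is an illustration under assumed inputs, not the paper's exact formulation: the function name, feature shapes, the simple dot-product scoring against a single expression embedding, and the softmax normalisation scheme are all hypothetical choices made for readability.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def language_guided_graph_attention(node_feats, edge_feats, lang_feat):
    """Hypothetical sketch of language-guided node and edge attention.

    node_feats: (N, d) visual features of candidate objects (graph nodes).
    edge_feats: (N, N, d) features of pairwise relationships (graph edges).
    lang_feat:  (d,) embedding of the referring expression.
    Returns one score per candidate object indicating how well it matches
    the expression, after weighting nodes and edges by the language.
    """
    # Node attention: how relevant each object is to the expression.
    node_att = softmax(node_feats @ lang_feat)                        # (N,)
    # Edge attention: how relevant each pairwise relationship is.
    edge_scores = edge_feats @ lang_feat                              # (N, N)
    edge_att = softmax(edge_scores.reshape(-1)).reshape(edge_scores.shape)
    # Aggregate relational context into each node, weighted by edge attention.
    context = (edge_att[..., None] * edge_feats).sum(axis=1)          # (N, d)
    # Fuse each node's own features with its relational context,
    # modulated by the node attention.
    fused = node_att[:, None] * (node_feats + context)                # (N, d)
    # Final matching score per candidate object.
    return fused @ lang_feat                                          # (N,)

# Toy usage: 4 candidate objects with 8-dimensional features.
rng = np.random.default_rng(0)
scores = language_guided_graph_attention(
    rng.normal(size=(4, 8)), rng.normal(size=(4, 4, 8)), rng.normal(size=8))
print(scores.argmax())  # index of the predicted referent
```

Because the node and edge attention weights are computed explicitly, they can be inspected and overlaid on the image, which is what makes the comprehension decision visualisable and explainable in this framework.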