Abstract

In recent years, multimodal tasks have been receiving increasing attention. Referring expression comprehension aims to localize the object referred to by a natural language sentence. The challenge of this task is that the system must both recognize image content and understand the text in order to determine the corresponding target. In this paper, an interpretable method is proposed to perform referring expression comprehension by decomposing the input sentence into semantic units. The proposed method learns joint text and visual features and generates an attention map over the target object. A visualization of the comprehension process can be obtained at the end to further analyze how the deep model interprets the query object.
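To make the idea of matching a decomposed semantic unit against visual features and producing an attention map more concrete, below is a minimal illustrative sketch (not the paper's actual architecture): a hypothetical PhraseVisualAttention module that projects a phrase embedding and spatial visual features into a joint space and computes a soft attention map over image locations. All names, dimensions, and design choices here are assumptions for illustration only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PhraseVisualAttention(nn.Module):
    """Scores each spatial location of a visual feature map against a
    phrase (semantic unit) embedding and returns a soft attention map.
    Illustrative sketch only; not the method described in the paper."""

    def __init__(self, visual_dim: int, text_dim: int, joint_dim: int = 256):
        super().__init__()
        self.visual_proj = nn.Linear(visual_dim, joint_dim)  # project visual features to joint space
        self.text_proj = nn.Linear(text_dim, joint_dim)      # project phrase embedding to joint space

    def forward(self, visual_feats: torch.Tensor, phrase_emb: torch.Tensor) -> torch.Tensor:
        # visual_feats: (B, H*W, visual_dim); phrase_emb: (B, text_dim)
        v = F.normalize(self.visual_proj(visual_feats), dim=-1)           # (B, H*W, joint_dim)
        t = F.normalize(self.text_proj(phrase_emb), dim=-1).unsqueeze(2)  # (B, joint_dim, 1)
        scores = torch.bmm(v, t).squeeze(-1)                              # (B, H*W) cosine-style scores
        return F.softmax(scores, dim=-1)                                  # attention over spatial locations

# Example usage: a single image with a 7x7 feature grid and one phrase embedding.
attn = PhraseVisualAttention(visual_dim=2048, text_dim=300)
visual = torch.randn(1, 49, 2048)
phrase = torch.randn(1, 300)
attention_map = attn(visual, phrase).view(1, 7, 7)  # can be rendered as a heat map for interpretability
```

Reshaping the attention weights back to the spatial grid is what allows the comprehension process to be visualized as a heat map over the image.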
