Multi-Task Collaborative Network for Joint Referring Expression Comprehension and Segmentation

Gen Luo, Xiaoshuai Sun, Cheng Deng, Yiyi Zhou, Rongrong Ji, Liujuan Cao, Chenglin Wu

doi:10.1109/cvpr42600.2020.01005

Abstract

Referring expression comprehension (REC) and segmentation (RES) are two highly related tasks that both aim to identify the referent described by a natural language expression. In this paper, we propose a novel Multi-task Collaborative Network (MCN) that, for the first time, learns REC and RES jointly. In MCN, RES helps REC achieve better language-vision alignment, while REC helps RES better locate the referent. In addition, we address a key challenge in this multi-task setup, namely prediction conflict, with two designs: Consistency Energy Maximization (CEM) and Adaptive Soft Non-Located Suppression (ASNLS). Specifically, CEM enables REC and RES to focus on similar visual regions by maximizing the consistency energy between the two tasks, and ASNLS suppresses the response of unrelated regions in RES based on the prediction of REC. To validate our model, we conduct extensive experiments on three benchmark datasets for REC and RES: RefCOCO, RefCOCO+ and RefCOCOg. The results show significant performance gains for MCN over all existing methods, up to +7.13% for REC and +11.50% for RES over the state of the art, which confirms the validity of our model for joint REC and RES learning.
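
The suppression idea behind ASNLS can be illustrated with a small sketch. The abstract does not give the formula, so everything below is an assumption made for illustration and not the authors' formulation: the function name `soft_suppress`, the linear weighting by box confidence, and the parameters `up`, `down` and `threshold` are all hypothetical. The sketch only shows the general pattern of reweighting RES responses inside and outside the box predicted by REC before binarizing.

```python
import numpy as np


def soft_suppress(seg_probs, box, box_conf, up=1.5, down=0.5, threshold=0.5):
    """Reweight a segmentation probability map using a predicted box.

    seg_probs: (H, W) array of per-pixel probabilities in [0, 1] (RES output).
    box:       (x0, y0, x1, y1) in pixel coordinates, x1/y1 exclusive (REC output).
    box_conf:  REC confidence in [0, 1].
    up, down:  hypothetical boost/decay factors reached at box_conf == 1.
    """
    x0, y0, x1, y1 = box
    inside = np.zeros(seg_probs.shape, dtype=bool)
    inside[y0:y1, x0:x1] = True

    # Assumed weighting: no change when the box is untrusted (conf = 0),
    # full boost inside and full decay outside when it is trusted (conf = 1).
    alpha_in = 1.0 + (up - 1.0) * box_conf
    alpha_out = 1.0 - (1.0 - down) * box_conf

    reweighted = np.where(inside, alpha_in * seg_probs, alpha_out * seg_probs)
    # Binarize the softly reweighted responses into the final mask.
    return np.clip(reweighted, 0.0, 1.0) >= threshold


if __name__ == "__main__":
    probs = np.full((4, 4), 0.6)
    mask = soft_suppress(probs, box=(1, 1, 3, 3), box_conf=1.0)
    print(mask.astype(int))
```

Because the weights scale with the box confidence, a low-confidence box leaves the segmentation output almost unchanged, which is the "soft" aspect this sketch tries to convey, in contrast to hard cropping of the mask to the box.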

Similar Papers

Multiple Relational Learning Network for Joint Referring Expression Comprehension and Segmentation
Guoguang Hua ... Yuhang Zhang
IEEE Transactions on Multimedia | VOL. 25
01 Jan 2023

A Real-Time Global Inference Network for One-Stage Referring Expression Comprehension.
Yiyi Zhou ... Gen Luo
IEEE Transactions on Neural Networks and Learning Systems | VOL. 34
01 Jan 2023

Language-Attention Modular-Network for Relational Referring Expression Comprehension in Videos
Naina Dhingra ... Shipra Jain
21 Aug 2022

Continual Referring Expression Comprehension via Dual Modular Memorization.
Heng Tao Shen ... Cheng Chen
IEEE Transactions on Image Processing: a publication of the IEEE Signal Processing Society | VOL. 31
01 Jan 2021
