Abstract

The purpose of co-salient object detection (CoSOD) is to detect the salient objects that co-occur in a group of relevant images. CoSOD has prospered significantly thanks to recent advances in convolutional neural networks (CNNs). However, CNNs show general limitations in modeling long-range feature dependencies, which are crucial for CoSOD. In vision transformers, the self-attention mechanism captures global dependencies but unfortunately destroys local spatial details, which are also essential for CoSOD. To address these issues, we propose a dual network structure, called TCNet, which can efficiently excavate both local information and global representations for co-saliency learning via the parallel interaction of Transformers and CNNs. Specifically, it contains three critical components, i.e., the mutual consensus module (MCM), the consensus complementary module (CCM), and the group consistent progressive decoder (GCPD). MCM aims to capture the global consensus from the high-level features of the two branches, which then guides the integration of consensus cues from both branches at each level. Next, CCM is designed to effectively fuse the consensus of local information and global contexts from different levels of the two branches. Finally, GCPD is developed to maintain group feature consistency and predict accurate co-saliency maps.
The proposed TCNet is evaluated on five challenging CoSOD benchmark datasets using six widely used metrics, and the results show that it is superior to existing cutting-edge methods for co-salient object detection.
