CCANet: A Collaborative Cross-Modal Attention Network for RGB-D Crowd Counting

Yanbo Liu, Yingxiang Hu, Guo Cao, Boshan Shi

doi:10.1109/tmm.2023.3262978

Abstract

To obtain more accurate density maps and crowd counts, existing methods often train on RGB images combined with depth images. However, these methods do not capture and fuse the complementary features of the RGB and depth modalities well. To address this problem, we propose a collaborative cross-modal attention network, CCANet, for accurate RGB-D crowd counting. CCANet is mainly composed of a collaborative cross-modal attention module (CCAM) and a collaborative cross-modal fusion module (CCFM). Specifically, CCAM adaptively interleaves RGB and depth information through channel and spatial cross-modal attention to capture the complementary features of the two modalities. CCFM then adaptively integrates these complementary features by weighing their importance. Extensive experiments on the ShanghaiTechRGBD and MICC benchmarks demonstrate the effectiveness of CCANet in RGB-D crowd counting. In addition, CCANet is generally applicable to multimodal crowd counting and achieves superior counting performance on the RGBT-CC benchmark.
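The abstract does not give the internal design of CCAM or CCFM, so the following is only a toy sketch of the general idea it describes: one modality gating the other along channels and spatial positions, followed by an importance-weighted fusion. All function names are hypothetical, and the fixed sigmoid and softmax gates stand in for layers that a real network would learn; this is not the authors' implementation.

```python
import math

# Feature maps are nested lists shaped [channels][height][width].
# Every gate below is parameter-free (an assumption for illustration);
# a trained network would compute its gates with learned layers.

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_means(feat):
    """Global average pooling: one mean value per channel."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feat]

def cross_channel_attention(target, source):
    """Scale each channel of `target` by a gate computed from `source`."""
    gates = [sigmoid(m) for m in channel_means(source)]
    return [[[g * v for v in row] for row in ch] for g, ch in zip(gates, target)]

def cross_spatial_attention(target, source):
    """Scale each location of `target` by a gate computed from `source`."""
    c, h, w = len(source), len(source[0]), len(source[0][0])
    gate = [[sigmoid(sum(source[k][i][j] for k in range(c)) / c)
             for j in range(w)] for i in range(h)]
    return [[[gate[i][j] * ch[i][j] for j in range(w)] for i in range(h)]
            for ch in target]

def weighted_fusion(rgb, depth):
    """Blend the two modalities per channel with two-way softmax weights."""
    fused = []
    for r_ch, d_ch, r_m, d_m in zip(rgb, depth,
                                    channel_means(rgb), channel_means(depth)):
        top = max(r_m, d_m)  # subtract the max for numerical stability
        e_r, e_d = math.exp(r_m - top), math.exp(d_m - top)
        w_r = e_r / (e_r + e_d)
        w_d = 1.0 - w_r
        fused.append([[w_r * r + w_d * d for r, d in zip(r_row, d_row)]
                      for r_row, d_row in zip(r_ch, d_ch)])
    return fused

# Usage: depth gates RGB, RGB gates depth, then the two streams are fused.
rgb_feat = [[[1.0, 2.0], [3.0, 4.0]], [[0.5, 0.5], [0.5, 0.5]]]
depth_feat = [[[0.2, 0.1], [0.0, 0.3]], [[1.0, 1.0], [1.0, 1.0]]]
rgb_att = cross_spatial_attention(
    cross_channel_attention(rgb_feat, depth_feat), depth_feat)
depth_att = cross_spatial_attention(
    cross_channel_attention(depth_feat, rgb_feat), rgb_feat)
fused_feat = weighted_fusion(rgb_att, depth_att)
```

In this sketch the gates are computed from the other modality rather than from the features being reweighted, which is what makes the attention cross-modal instead of self-attention.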

Journal: IEEE Transactions on Multimedia
Publication Date: Jan 1, 2024
Citations: 9

Similar Papers

Action Video Games Make Dyslexic Children Read Better
Sandro Franceschini ... Andrea Facoetti
Current Biology | VOL. 23
28 Feb 2013

Spatial and Cross-Modal Attention Alter Responses to Unattended Sensory Information in Early Visual and Auditory Human Cortex
Vivian M Ciaramitaro ... Giedrius T Buračas
Journal of Neurophysiology | VOL. 98
22 Aug 2007

Spatial-Temporal Graph Network for Video Crowd Counting
Zhe Wu ... Geng Tian
IEEE Transactions on Circuits and Systems for Video Technology | VOL. 33
01 Jan 2023

Dynamics of Within-, Inter-, and Cross-Modal Attentional Modulation
Tetsuo Kida ... Ryusuke Kakigi
Journal of Neurophysiology | VOL. 105
08 Dec 2010
