Dilated high-resolution network driven RGB-T multi-modal crowd counting

Zhengyi Liu, Yacheng Tan, Wei Wu, Bin Tang

doi:10.1016/j.image.2022.116915

Abstract

Crowd counting aims to estimate the number of pedestrians in a scene, but insufficient illumination and large variation in scale degrade its accuracy. This paper proposes an RGB-T multi-modal crowd counting model driven by a dilated high-resolution network (DHRNet) to address both problems. Regarding the relative importance of the RGB and thermal modalities, the model adopts a thermal-main, RGB-auxiliary strategy instead of treating the two equally as previous works do. Regarding their fusion, a cross-modal fusion module is designed and embedded at the input feature level of DHRNet. Regarding the use of DHRNet output features, a multilayer perceptron regression head is proposed to predict high-quality density maps. Experimental results on public datasets show that the proposed network significantly outperforms state-of-the-art methods. Ablation studies verify the effectiveness of the thermal-main, RGB-auxiliary strategy and of the proposed modules.
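The abstract states that the network predicts density maps but does not say how a count is read from one. The following is a minimal sketch of the convention commonly used in density-based crowd counting, not the authors' implementation: each annotated head contributes a Gaussian kernel normalized to sum to 1, so the sum over the whole map gives the count. The function names, the kernel size, the sigma value, and the sample head coordinates are all hypothetical.

```python
import math


def gaussian_kernel(size, sigma):
    """Return a size x size Gaussian kernel normalized to sum to 1."""
    half = size // 2
    kernel = [
        [math.exp(-((x - half) ** 2 + (y - half) ** 2) / (2 * sigma ** 2))
         for x in range(size)]
        for y in range(size)
    ]
    total = sum(sum(row) for row in kernel)
    return [[v / total for v in row] for row in kernel]


def density_map(points, height, width, size=5, sigma=1.0):
    """Build a density map by stamping one unit-mass kernel per head point.

    points holds (row, col) head annotations. Kernel mass that falls
    outside the image is dropped, so heads near the border contribute
    slightly less than 1 to the total.
    """
    kernel = gaussian_kernel(size, sigma)
    half = size // 2
    dmap = [[0.0] * width for _ in range(height)]
    for py, px in points:
        for ky in range(size):
            for kx in range(size):
                y, x = py + ky - half, px + kx - half
                if 0 <= y < height and 0 <= x < width:
                    dmap[y][x] += kernel[ky][kx]
    return dmap


def count_from_density(dmap):
    """Estimate the crowd count as the sum of all density values."""
    return sum(sum(row) for row in dmap)


# Hypothetical annotations: three heads, all far enough from the border
# that no kernel mass is lost.
heads = [(4, 4), (8, 10), (12, 6)]
dmap = density_map(heads, height=16, width=16)
estimated = count_from_density(dmap)
```

Here `estimated` recovers the number of annotated heads up to floating-point error. A trained model would output `dmap` directly from the RGB and thermal inputs, and the same summation would give its predicted count.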

Journal: Signal Processing: Image Communication
Publication Date: Dec 23, 2022
Citations: 8

Similar Papers

CSA-Net: Cross-modal scale-aware attention-aggregated network for RGB-T crowd counting
He Li ... Yuguang Shao
Expert Systems with Applications | VOL. 213
20 Oct 2022

Online Learning Samples and Adaptive Recovery for Robust RGB-T Tracking
Jun Liu ... Zhongqiang Luo
IEEE Transactions on Circuits and Systems for Video Technology | VOL. 34
01 Feb 2024

Consistency-constrained RGB-T crowd counting via mutual information maximization
Qiang Guo ... Yangdong Ye
Complex & Intelligent Systems | VOL. 10
15 Apr 2024

Why Existing Multimodal Crowd Counting Datasets Can Lead to Unfulfilled Expectations in Real-World Applications
Martin Thißen ... Elke Hergenröther
01 Jul 2023
