Consistency-constrained RGB-T crowd counting via mutual information maximization

Qiang Guo, Pengcheng Yuan, Xiangming Huang, Yangdong Ye

doi:10.1007/s40747-024-01427-x

Abstract

Thermal imaging data in RGB-T images has proven useful for cross-modal crowd counting because it offers information complementary to RGB representations. Although existing RGB-T crowd counting methods achieve satisfactory results, many still face two significant limitations: (1) they overlook the heterogeneous gap between modalities, which complicates the effective integration of multimodal features; and (2) they do not mine cross-modal consistency, which hinders full exploitation of the complementary strengths of each modality. To address these limitations, we present C4-MIM, a novel Consistency-constrained RGB-T Crowd Counting approach via Mutual Information Maximization. It leverages multimodal information by learning the consistency between the RGB and thermal modalities, thereby enhancing the performance of cross-modal counting. Specifically, we first extract the feature representations of the different modalities with a shared encoder; because both modalities obey identical coding rules through shared parameters, this moderates the heterogeneous gap. Then, we mine the consistent information across modalities to learn more useful information and improve the quality of the feature representations. To do so, we formulate the complementarity of the multimodal representations as a mutual information maximization regularizer that maximizes the consistent information of the different modalities, so that consistency is maximally attained before the multimodal information is combined. Finally, we simply aggregate the feature representations of the different modalities and send them to a regressor that outputs the density maps. The proposed approach can be implemented with arbitrary backbone networks and remains robust when a single modality is unavailable or seriously compromised. Extensive experiments on the RGBT-CC and DroneRGBT benchmarks evaluate the effectiveness and robustness of the proposed approach and demonstrate its superior performance compared with state-of-the-art approaches.
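The abstract does not say which mutual information estimator C4-MIM uses. As a rough illustration of what a mutual information maximization regularizer can look like, the sketch below uses the InfoNCE contrastive bound, which is an assumption and not necessarily the authors' choice. The function names (`info_nce`, `total_loss`), the `temperature` and `weight` parameters, and the toy feature vectors are all hypothetical. For a batch of N paired RGB and thermal feature vectors, minimizing the InfoNCE loss raises the lower bound log N minus the loss on the mutual information between the two modalities.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(dot(v, v)) or 1.0
    return [a / norm for a in v]


def info_nce(rgb_feats, thermal_feats, temperature=0.1):
    """InfoNCE loss over a batch of paired RGB/thermal features.

    Row i of each input is assumed to come from the same scene, so
    (rgb[i], thermal[i]) is the positive pair and every thermal[j], j != i,
    acts as a negative. This estimator is an illustrative assumption.
    """
    rgb = [l2_normalize(v) for v in rgb_feats]
    thermal = [l2_normalize(v) for v in thermal_feats]
    n = len(rgb)
    loss = 0.0
    for i in range(n):
        logits = [dot(rgb[i], thermal[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
        # Cross-entropy with the matching index i as the target.
        loss += log_sum - logits[i]
    return loss / n


def total_loss(counting_loss, rgb_feats, thermal_feats, weight=0.1):
    """Counting loss plus the weighted consistency regularizer (hypothetical weighting)."""
    return counting_loss + weight * info_nce(rgb_feats, thermal_feats)


# Toy usage: features as they might come out of a shared encoder.
rgb_batch = [[1.0, 0.0], [0.0, 1.0]]
thermal_batch = [[0.9, 0.1], [0.2, 0.8]]
print(total_loss(counting_loss=2.0, rgb_feats=rgb_batch, thermal_feats=thermal_batch))
```

In this sketch the regularizer acts on the per-modality features before they are aggregated, which mirrors the ordering described in the abstract: consistency is encouraged first, and fusion and density regression follow.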

Journal: Complex & Intelligent Systems
Publication Date: Apr 15, 2024
Citations: 1
License type: CC BY 4.0

Similar Papers

Multimodal Mutual Information Maximization: A Novel Approach for Unsupervised Deep Cross-Modal Hashing
Tuan Hoang ... Tam V Nguyen
IEEE Transactions on Neural Networks and Learning Systems | VOL. PP
01 Sep 2023

Image Search with Text Feedback by Deep Hierarchical Attention Mutual Information Maximization
Chunbin Gu ... Jiajun Bu
17 Oct 2021

Modality-specific Adaptive Scaling Method for Cross-modal Retrieval
Baitao Chen ... Xiao Ke
28 Oct 2022

Modality-specific adaptive scaling and attention network for cross-modal retrieval
Xiao Ke ... Weibin Chen
Neurocomputing | VOL. 612
01 Jan 2025
