Abstract

Visible-infrared person re-identification holds significant implications for intelligent security. Unsupervised methods can reduce the gap between modalities without labels, but most previous unsupervised methods train only on image information, so the model cannot obtain rich deep semantic information. In this paper, we leverage CLIP to extract deep text information. We propose a Text-Image Alignment (TIA) module to align image and text features and effectively bridge the gap between the visible and infrared modalities. We further propose a Local-Global Image Match (LGIM) module to find homogeneous information: specifically, we employ the Hungarian algorithm and the Simulated Annealing (SA) algorithm to extract original information from image features while mitigating the interference of heterogeneous information. Additionally, we design a Changeable Cross-modality Alignment Loss (CCAL) that enables the model to learn modality-specific features at different training stages. Through this targeted learning, our method achieves strong performance and robustness. Extensive experiments demonstrate the effectiveness of our approach: it achieves a rank-1 accuracy that exceeds state-of-the-art approaches by approximately 10% on the RegDB dataset.
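To make the matching step of the LGIM module concrete, the sketch below shows how local features could be assigned to global features with the Hungarian algorithm. The function name, feature shapes, and cosine-distance cost are assumptions for illustration only; the paper's actual LGIM also involves a Simulated Annealing step that is not shown here.

```python
import torch
from scipy.optimize import linear_sum_assignment


def match_local_to_global(local_feats, global_feats):
    """Hypothetical sketch of local-global matching via the Hungarian algorithm.

    local_feats:  (N, D) local/part features from one modality
    global_feats: (N, D) global features from the other modality
    Returns index pairs minimizing the total (1 - cosine similarity) cost.
    """
    # Build a cost matrix; a lower cost means a more homogeneous pair.
    local_n = torch.nn.functional.normalize(local_feats, dim=1)
    global_n = torch.nn.functional.normalize(global_feats, dim=1)
    cost = (1.0 - local_n @ global_n.T).cpu().numpy()

    # The Hungarian algorithm finds the minimum-cost one-to-one assignment.
    row_idx, col_idx = linear_sum_assignment(cost)
    return list(zip(row_idx.tolist(), col_idx.tolist()))
```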
