Abstract
Visible-infrared person re-identification (VI-ReID) aims to match persons across the visible and infrared modalities; however, its performance degrades in complex dynamic scenes involving occlusions, background shifts, and pose changes. In this paper, we propose a Multi-scale Dynamic Fusion Network (MDFN) to address these challenges in VI-ReID. Specifically, MDFN consists of three modules: Dynamic Feature Fusion (DFF), Dynamic Perception Enhancement (DPE), and Feature Reweighting with Similarity (FRS). The DFF module dynamically captures both local and long-range dependencies among features to obtain finer-grained discriminative representations. The DPE module extracts multi-scale features from both the visible and infrared modalities to generate diverse embeddings. The FRS module mitigates the information imbalance between the two modalities, further improving performance. Extensive experiments on the SYSU-MM01 and RegDB datasets show that MDFN outperforms other state-of-the-art methods, especially in complex dynamic scenes with occlusions, background shifts, and pose changes.
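To make the module roles concrete, the following is a minimal, hypothetical PyTorch sketch of how the three modules might compose. The abstract does not specify any internals, so every design choice below (the convolution-plus-attention fusion in DFF, the pooling scales in DPE, and the cosine-similarity reweighting in FRS) is an illustrative assumption, not the authors' implementation.

```python
# Hypothetical sketch of the MDFN module roles described in the abstract.
# All layer choices are illustrative assumptions, not the paper's design.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DFF(nn.Module):
    """Dynamic Feature Fusion: combine local (conv) and long-range
    (self-attention) dependencies. `dim` must be divisible by num_heads."""
    def __init__(self, dim):
        super().__init__()
        self.local = nn.Conv2d(dim, dim, kernel_size=3, padding=1)
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.gate = nn.Conv2d(2 * dim, dim, kernel_size=1)

    def forward(self, x):                       # x: (B, C, H, W)
        b, c, h, w = x.shape
        local = self.local(x)
        seq = x.flatten(2).transpose(1, 2)      # (B, H*W, C)
        long_range, _ = self.attn(seq, seq, seq)
        long_range = long_range.transpose(1, 2).reshape(b, c, h, w)
        return self.gate(torch.cat([local, long_range], dim=1))

class DPE(nn.Module):
    """Dynamic Perception Enhancement: pool features at several spatial
    scales and fuse them to produce diverse embeddings."""
    def __init__(self, dim, scales=(1, 2, 4)):  # scales are an assumption
        super().__init__()
        self.scales = scales
        self.proj = nn.Conv2d(dim * len(scales), dim, kernel_size=1)

    def forward(self, x):
        h, w = x.shape[-2:]
        feats = [F.interpolate(F.adaptive_avg_pool2d(x, s), size=(h, w),
                               mode="bilinear", align_corners=False)
                 for s in self.scales]
        return self.proj(torch.cat(feats, dim=1))

def frs(vis_emb, ir_emb):
    """Feature Reweighting with Similarity: one plausible reading, in which
    each embedding pair is weighted by its cross-modality cosine similarity
    to damp the effect of information imbalance. Inputs: (B, D) embeddings."""
    sim = F.cosine_similarity(vis_emb, ir_emb, dim=1, eps=1e-6).unsqueeze(1)
    w = torch.sigmoid(sim)                      # per-pair weight in (0, 1)
    return w * vis_emb, w * ir_emb
```

Under this reading, a forward pass would apply DFF and DPE to each modality's feature map, pool the results into embeddings, and pass the visible/infrared pair through frs before computing the matching loss.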