Abstract
Visible-infrared person re-identification (VI-ReID) aims to retrieve person images of the same identity across the RGB and infrared image spaces, which is important for real-world surveillance systems. In practice, VI-ReID is more challenging due to the heterogeneous modality discrepancy, which further aggravates the difficulties of the traditional single-modality person ReID problem, i.e., inter-class confusion and intra-class variation. In this paper, we propose an aggregated memory-based cross-modality deep metric learning framework, which benefits from an increasing number of learned modality-aware and modality-agnostic centroid proxies for cluster contrast and mutual information learning. Furthermore, to suppress the modality discrepancy, the proposed cross-modality alignment objective simultaneously utilizes both historical and up-to-date learned cluster proxies for enhanced cross-modality association. Such a training mechanism yields hard positive references through the increased diversity of the learned cluster proxies and ultimately achieves a stronger ``pulling close'' effect between cross-modality image features. Extensive experimental results demonstrate the effectiveness of the proposed method, which surpasses state-of-the-art works by a large margin on the commonly used VI-ReID datasets.
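To make the memory-based cluster-contrast idea concrete, the following is a minimal sketch (not the authors' implementation), assuming a PyTorch setup in which each identity cluster keeps a centroid proxy in a memory bank that is refreshed with a momentum update and contrasted against batch features. The names `ClusterMemory`, `momentum`, and `temperature` are illustrative assumptions, not taken from the paper; modality-aware and modality-agnostic memories would follow the same pattern.

```python
import torch
import torch.nn.functional as F


class ClusterMemory(torch.nn.Module):
    """Illustrative memory bank of centroid proxies with a cluster-contrast loss."""

    def __init__(self, feat_dim, num_clusters, momentum=0.2, temperature=0.05):
        super().__init__()
        self.momentum = momentum
        self.temperature = temperature
        # One L2-normalized centroid proxy per cluster, kept outside the computation graph.
        self.register_buffer(
            "centroids", F.normalize(torch.randn(num_clusters, feat_dim), dim=1)
        )

    def forward(self, features, labels):
        features = F.normalize(features, dim=1)
        # Contrast each batch feature against all centroid proxies.
        logits = features @ self.centroids.t() / self.temperature
        loss = F.cross_entropy(logits, labels)
        # Momentum update of the centroids for the clusters seen in this batch.
        with torch.no_grad():
            for feat, label in zip(features, labels):
                updated = self.momentum * self.centroids[label] + (1 - self.momentum) * feat
                self.centroids[label] = F.normalize(updated, dim=0)
        return loss


# Hypothetical usage with illustrative sizes (feature dimension, cluster count, batch size).
memory = ClusterMemory(feat_dim=2048, num_clusters=395)
feats = torch.randn(32, 2048)            # backbone embeddings for a mini-batch
labels = torch.randint(0, 395, (32,))    # cluster / identity assignments
loss = memory(feats, labels)
```

In this sketch, keeping both the pre-update (historical) and post-update centroids around would be one way to realize the paper's idea of aligning features against historical as well as up-to-date proxies; the single-memory version above is shown only for brevity.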