Abstract

Cross-modality person re-identification between the visible and infrared domains is important for night-time surveillance but extremely challenging. Besides the cross-modality discrepancies caused by different camera spectra, visible-infrared person re-identification (VI-REID) suffers, like traditional person re-identification, from pedestrian misalignment and from variations caused by different camera viewpoints and pedestrian pose deformations. In this paper, we propose a multi-path adaptive pedestrian alignment network (MAPAN) to learn discriminative feature representations. The multi-path network learns features directly from the data in an end-to-end manner and aligns pedestrians adaptively without any additional manual annotations. To alleviate the intra-modality discrepancies caused by image misalignment, we combine the aligned visible image features with the original visible image features and strengthen the network's attention towards pedestrians, which significantly improves the distinguishability of the learned features. To mitigate the cross-modality discrepancies between the visible and infrared domains, the discriminative features of the two modalities are mapped to the same feature embedding space, and the identity loss and the triplet loss are combined as the overall loss. Extensive experiments demonstrate the superior performance of the proposed method compared with state-of-the-art methods.

Highlights

  • Person Re-identification is a technique in the field of computer vision to identify a specific pedestrian as the same as one encountered on a previous occasion [1]

  • We propose, for the first time, an end-to-end multi-path adaptive pedestrian alignment network (MAPAN) to deal with the intra-modality discrepancies in misaligned images that arise during the acquisition of the cross-modality dataset

Summary

INTRODUCTION

Person re-identification (ReID) is a computer vision technique for identifying a specific pedestrian as (numerically) the same as one encountered on a previous occasion [1]. We propose, for the first time, an end-to-end multi-path adaptive pedestrian alignment network (MAPAN) to deal with the intra-modality discrepancies in misaligned images that arise during the acquisition of the cross-modality dataset. Because the additional modality information captured by an infrared camera was integrated with standard RGB visible images, person re-identification performance was improved efficiently. The work in [25] raised the visible-infrared cross-modality re-identification (VI-REID) problem for the first time, contributed a large-scale cross-modality pedestrian dataset, SYSU-MM01, for VI-REID, and proposed a deep zero-padding method that uses a one-stream network to capture domain-specific information. For VI-REID, the two main challenges are the lack of authentication information for re-identifying the same person across the visible and infrared domains, and the difficulty of learning a robust representation for such large-scale cross-modality person retrieval. The input visible images exhibit more misalignment; to obtain more robust visible features through affine transformation correction, we fuse the features of the visible base branch and the affine transformation branch by weighted addition.
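
The weighted addition of the two visible-branch features can be illustrated with a minimal sketch. The text above does not give the actual weights or feature dimensions, so the function name `fuse_features`, the weight `lam`, and the choice of a convex combination (weights summing to 1) are assumptions made purely for illustration, not the authors' implementation.

```python
from typing import List, Sequence


def fuse_features(
    base: Sequence[float],
    aligned: Sequence[float],
    lam: float = 0.5,  # hypothetical fusion weight; the source does not state its value
) -> List[float]:
    """Fuse two equal-length feature vectors by weighted addition.

    `base` stands in for the visible base branch feature and `aligned`
    for the affine transformation branch feature. Using lam and (1 - lam)
    as the two weights is an assumption of this sketch.
    """
    if len(base) != len(aligned):
        raise ValueError("feature vectors must have the same length")
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return [lam * b + (1.0 - lam) * a for b, a in zip(base, aligned)]


# Toy 3-dimensional features; real features would be much longer.
base_feat = [1.0, 2.0, 4.0]
aligned_feat = [3.0, 2.0, 0.0]
fused = fuse_features(base_feat, aligned_feat, lam=0.5)
print(fused)  # [2.0, 2.0, 2.0]
```

With `lam = 1.0` the sketch falls back to the base branch feature alone, and with `lam = 0.0` to the aligned feature alone; intermediate values trade off the two.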

INFRARED BRANCH AND VISIBLE BASE BRANCH
VISIBLE AFFINE TRANSFORMATION BRANCH
THE FEATURE EMBEDDING
THE OVERALL LOSS
CONCLUSION