Abstract

Visible-infrared person re-identification (VI-ReID) aims to retrieve images of the same pedestrian across different modalities, which is a challenging task in video surveillance. Compared with RGB-based re-identification (Re-ID), which has sufficient single-modality training samples, VI-ReID suffers from imbalanced dual-modality data, which degrades the accuracy of deep learning classifiers. To this end, we present an image modality translation (IMT) network that learns to generate translated-modality images from images of a given modality. It performs image modality translation by means of a cycle-consistent adversarial network (CycleGAN) and serves as a data augmentation tool that restores balance to the imbalanced training images. Concretely, our method consists of two steps: first, we train the IMT network on real images and generate target-modality samples to enlarge the training dataset and increase its diversity; then, the source images and the modality-translated images are combined to train a Re-ID CNN model, improving cross-modality retrieval performance. To validate the effectiveness of the proposed approach, we conduct experiments on the SYSU-MM01 and RegDB datasets. The experimental results indicate that our proposed method is significantly more accurate than state-of-the-art methods.
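
The following PyTorch sketch is only a rough illustration of the two-step pipeline summarized above; the class and function names (IMTGenerator, augment_with_translated) and the tiny generator architecture are illustrative assumptions, not the authors' released code or the actual CycleGAN-based IMT network.

```python
# Illustrative sketch (assumed names, not the paper's implementation):
# step 1 generates cross-modality counterparts with translation generators,
# step 2 would feed the enlarged, balanced batch to a Re-ID CNN.
import torch
import torch.nn as nn


class IMTGenerator(nn.Module):
    """Placeholder CycleGAN-style generator (visible -> infrared or the reverse)."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=7, padding=3), nn.ReLU(inplace=True),
            nn.Conv2d(64, 3, kernel_size=7, padding=3), nn.Tanh(),
        )

    def forward(self, x):
        return self.net(x)


def augment_with_translated(images_vis, images_ir, g_v2i, g_i2v):
    """Step 1: translate each modality to the other and merge with the real images."""
    with torch.no_grad():
        fake_ir = g_v2i(images_vis)   # visible -> infrared
        fake_vis = g_i2v(images_ir)   # infrared -> visible
    # Translated images keep the identity labels of their source images,
    # so the label tensor is simply repeated for the generated samples.
    return torch.cat([images_vis, images_ir, fake_vis, fake_ir], dim=0)


if __name__ == "__main__":
    g_v2i, g_i2v = IMTGenerator(), IMTGenerator()
    vis = torch.randn(4, 3, 256, 128)  # dummy visible-modality batch
    ir = torch.randn(4, 3, 256, 128)   # dummy infrared-modality batch
    enlarged_batch = augment_with_translated(vis, ir, g_v2i, g_i2v)
    print(enlarged_batch.shape)  # (16, 3, 256, 128): enlarged, modality-balanced batch
    # Step 2 (not shown): train a Re-ID CNN on this combined batch with
    # identity / cross-modality losses.
```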
