A novel two-stream saliency image fusion CNN architecture for person re-identification

Fuqing Zhu,Qi Tian,Haiyan Fu,Xiangwei Kong

doi:10.1007/s00530-017-0583-4

Abstract

Background interference, which arises from complex environment, is a critical problem for a robust person re-identification (re-ID) system. The background noise may significantly compromise the feature learning and matching process. To reduce the background interference, this paper proposes a saliency image embedding as a pedestrian descriptor. First, to eliminate the background for each pedestrian image, the saliency image is constructed, which is implemented through an unsupervised manifold ranking-based saliency detection algorithm. Second, to reduce some errors and details missing of pedestrian during the saliency image construction process, a saliency image fusion (SIF) convolutional neural network (CNN) architecture is well designed, in which the original pedestrian image and saliency image are both employed as input. We implement our idea in the identification models based on some state-of-the-art backbone CNN models (i.e., CaffeNet, VGGNet-16, GoogLeNet and ResNet-50). We show that the learned pedestrian descriptor by the proposed SIF CNN architecture provides a significant improvement over the baselines and produces a competitive performance compared with the state-of-the-art person re-ID methods on three large-scale person re-ID benchmarks (i.e., Market-1501, DukeMTMC-reID and MARS).

Full Text