Abstract

In this paper, we propose to learn deep features from body and parts (DFBP) in camera networks, combining the advantages of part-based and body-based features. Specifically, we utilize subregion pairs to train the part-based feature learning model, which predicts whether an input pair is a positive subregion pair. Meanwhile, we utilize holistic pedestrian images to train the body-based feature learning model, which predicts the identities of the input images. To further improve the discriminative power of the features, we concatenate the part-based and body-based features to form the final pedestrian representation. We evaluate the proposed DFBP on two large-scale databases, i.e., the Market1501 database and the CUHK03 database. The results demonstrate that the proposed DFBP outperforms state-of-the-art methods.
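A minimal PyTorch sketch of the two-branch design described above is given below. The class names, the pair-fusion choice, and parameters such as num_ids and feat_dim are illustrative assumptions introduced here, not the authors' implementation.

import torch
import torch.nn as nn
from torchvision import models


class BodyBranch(nn.Module):
    """Body-based branch: ResNet-50 backbone with a C-way identity classifier."""

    def __init__(self, num_ids: int):
        super().__init__()
        backbone = models.resnet50()  # pretrained weights optional
        self.features = nn.Sequential(*list(backbone.children())[:-1])  # drop the fc layer
        self.classifier = nn.Linear(2048, num_ids)  # predicts the identity label

    def forward(self, x):
        f = self.features(x).flatten(1)   # body-based feature f
        logits = self.classifier(f)       # per-identity scores (softmax gives probabilities)
        return f, logits


class PartBranch(nn.Module):
    """Part-based branch: predicts whether a subregion pair shows the same person."""

    def __init__(self, feat_dim: int = 512):
        super().__init__()
        backbone = models.resnet50()
        self.features = nn.Sequential(*list(backbone.children())[:-1])
        self.embed = nn.Linear(2048, feat_dim)
        self.verifier = nn.Linear(feat_dim, 2)  # 2-dim output (p1, p2): same / different person

    def encode(self, region):
        return self.embed(self.features(region).flatten(1))

    def forward(self, region_a, region_b):
        fa, fb = self.encode(region_a), self.encode(region_b)
        # Absolute difference is one simple pair-fusion choice assumed here.
        pair_logits = self.verifier((fa - fb).abs())
        return fa, fb, pair_logits


def dfbp_descriptor(body_feat, part_feats, weights):
    """Aggregate subregion features by a weighted sum and concatenate with the body feature."""
    part_feat = sum(w * f for w, f in zip(weights, part_feats))
    return torch.cat([part_feat, body_feat], dim=1)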

Highlights

  • Camera networks, as a kind of wireless sensor network, have received considerable attention due to their potential value in practical applications [1,2,3,4]

  • One of the most important applications of camera networks is person re-identification, the problem of searching for the same person across different camera sensors given a probe image

  • Existing approaches focus on two fundamental problems, i.e., feature representation and metric learning


Introduction

Camera networks, as a kind of wireless sensor network, have received considerable attention due to their potential value in practical applications [1,2,3,4]. Most existing methods take holistic pedestrian images as input and learn body-based features, which discards the local characteristics of pedestrians. In DFBP, we utilize the holistic images to train an identification model for the body-based features: we adopt the identification model shown in Fig. 2b, use ResNet-50 [21] as the CNN model, and apply a convolutional layer to turn the resulting vector f into a C-dimensional vector in which each element represents the predicted probability of the input pedestrian image belonging to one identity label. For the part-based features, we train on subregion pairs and apply a convolutional layer to turn the resulting vector fs into a 2-dimensional vector (p1, p2) representing the predicted probability of the input subregion pair belonging to the same person. Finally, we extract a feature for each subregion and aggregate the subregion features by weighted summation to obtain the part-based feature.
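A plausible form of this weighted aggregation, assuming N subregions with features f_i and corresponding weights w_i (notation introduced here for illustration), is

f_{\text{part}} = \sum_{i=1}^{N} w_i \, f_i,

where each weight w_i reflects the contribution of the i-th subregion to the part-based feature.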
