Abstract

Traditional person re-identification (ReID) methods mainly focus on cross-camera scenarios, while identifying a person in the same video/camera across adjacent subsequent frames is also an important problem, for example, in human tracking and pose tracking. We address this unexplored in-video ReID problem with a new large-scale video-based ReID dataset called PoseTrack-ReID, with full images available, and a new network structure called ReID-Head, which can extract multi-person features efficiently in real time and can be integrated with both one-stage and two-stage human or pose detectors. A new loss function is also required for this new in-video problem. We therefore propose a triplet-based loss function with online hard example mining, designed to distinguish persons in the same video/group, called instance hard triplet loss, which can be applied to both cross-camera ReID and in-video ReID. Compared with the widely used batch hard triplet loss, our proposed loss achieves competitive performance and saves more than 30% of the training time. We also propose an automatic reciprocal identity association method, so our model can be trained in an unsupervised way, which further extends the potential applications of in-video ReID. The PoseTrack-ReID dataset and code will be publicly released.
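The exact formulation of the instance hard triplet loss is not given in this summary, so the following is only a minimal sketch of the general idea it describes: a triplet loss with online hard example mining whose negatives are restricted to other persons in the same video/group. The function name, arguments, and margin value are illustrative assumptions, not the paper's formulation.

```python
# Illustrative sketch only (not the paper's exact loss): triplet loss with
# online hard example mining restricted to persons in the same video/group.
# Assumes `features` is an (N, D) tensor of embeddings and `person_ids`,
# `video_ids` are integer labels per sample.
import torch
import torch.nn.functional as F

def instance_hard_triplet_loss(features, person_ids, video_ids, margin=0.3):
    """For each anchor, mine the hardest positive (same person) and the
    hardest negative among other persons appearing in the same video."""
    dist = torch.cdist(features, features)                       # pairwise L2 distances
    same_person = person_ids.unsqueeze(0) == person_ids.unsqueeze(1)
    same_video = video_ids.unsqueeze(0) == video_ids.unsqueeze(1)

    eye = torch.eye(len(features), dtype=torch.bool, device=features.device)
    pos_mask = same_person & ~eye                                 # same ID, not itself
    neg_mask = same_video & ~same_person                          # other IDs, same video

    # hardest positive: farthest sample with the same identity
    hardest_pos = (dist * pos_mask).max(dim=1).values
    # hardest negative: closest sample of a different identity in the same video
    inf = torch.full_like(dist, float("inf"))
    hardest_neg = torch.where(neg_mask, dist, inf).min(dim=1).values

    valid = pos_mask.any(dim=1) & neg_mask.any(dim=1)             # anchors with both
    loss = F.relu(hardest_pos - hardest_neg + margin)
    return loss[valid].mean() if valid.any() else loss.sum() * 0.0
```

Restricting the negative mining to the same video is what distinguishes this sketch from the standard batch hard triplet loss, which mines negatives over the whole batch regardless of which video or camera they come from.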

Highlights

  • Given a query person image, person re-identification aims to identify persons with the same identity (ID) in the gallery images

  • Even a state-of-the-art model trained on a popular large-scale cross-camera ReID dataset still performed poorly on the PoseTrack-ReID dataset, because in-video ReID is a different problem from cross-camera ReID and requires a model trained specifically for it; this is why we proposed a new dataset, a new network structure, and a new loss function for the in-video problem. Unlike ReID-Head, the performance of the Part-based Convolutional Baseline (PCB) did not degrade with a larger G, because cross-camera ReID is a long-term problem, which discards short-term clues, while in-video ReID is a short-term problem that relies on them

  • We proposed a new large-scale video-based in-video ReID dataset, PoseTrack-ReID, with full images available


Summary

Introduction

Given a query person image, person re-identification aims to identify persons with the same identity (ID) in the gallery images. Some researchers tried to incorporate ReID into human tracking [3,4,5]. They utilized extra ReID datasets to train a ReID model and used the obtained model for feature extraction. Those extracted ReID features were utilized to identify the tracking target from candidate persons, achieving better performance. Directly using a model trained on cross-camera ReID datasets, such as the Market-1501 [1] and CUHK03 [6] datasets, usually yields inferior performance due to cross-domain bias: the appearance in the source dataset is often very different from the appearance in the target dataset.
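As a concrete illustration of how such extracted ReID features are typically used for identification, the sketch below ranks gallery images by cosine similarity to a query embedding. The tensor names and shapes are assumptions for illustration, not details taken from the paper.

```python
# Minimal sketch of feature-based retrieval: rank gallery images by
# similarity to the query embedding. `query_feat` is a (D,) tensor and
# `gallery_feats` is an (N, D) tensor of embeddings from the ReID model.
import torch
import torch.nn.functional as F

def rank_gallery(query_feat, gallery_feats):
    """Return gallery indices sorted from most to least similar to the query."""
    q = F.normalize(query_feat, dim=0)            # unit-norm query embedding
    g = F.normalize(gallery_feats, dim=1)         # unit-norm gallery embeddings
    sims = g @ q                                  # cosine similarity per gallery image
    return torch.argsort(sims, descending=True)
```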
