Abstract
Traditional person re-identification (ReID) methods mainly focus on cross-camera scenarios, while identifying a person within the same video/camera across adjacent subsequent frames is also an important question, for example, in human tracking and pose tracking. We try to address this unexplored in-video ReID problem with a new large-scale video-based ReID dataset called PoseTrack-ReID, with full images available, and a new network structure called ReID-Head, which can extract multi-person features efficiently in real time and can be integrated with both one-stage and two-stage human or pose detectors. A new loss function is also required to solve this new in-video problem. Hence, we propose a triplet-based loss function with online hard example mining, designed to distinguish persons in the same video/group, called instance hard triplet loss, which can be applied to both cross-camera ReID and in-video ReID. Compared with the widely used batch hard triplet loss, our proposed loss achieves competitive performance and saves more than 30% of the training time. We also propose an automatic reciprocal identity association method, so that the model can be trained in an unsupervised way, which further extends the potential applications of in-video ReID. The PoseTrack-ReID dataset and code will be publicly released.
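To make the idea of a triplet loss with online hard example mining more concrete, here is a minimal pure-Python sketch. It follows the general batch-hard recipe (hardest positive and hardest negative per anchor) and, as an assumption, restricts mining to samples that share the anchor's video/group. It is not the exact definition of the instance hard triplet loss from the paper; the function name `group_hard_triplet_loss`, the `groups` argument, and the margin of 0.3 are all hypothetical choices made for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def group_hard_triplet_loss(features, ids, groups, margin=0.3):
    """Batch-hard style triplet loss, mined only inside each anchor's group.

    Assumption (not from the paper): positives and negatives are taken
    only from samples that share the anchor's group (e.g. the same video).
    """
    losses = []
    n = len(features)
    for i in range(n):
        # Distances to same-identity samples in the anchor's group.
        pos = [euclidean(features[i], features[j]) for j in range(n)
               if j != i and groups[j] == groups[i] and ids[j] == ids[i]]
        # Distances to different-identity samples in the anchor's group.
        neg = [euclidean(features[i], features[j]) for j in range(n)
               if groups[j] == groups[i] and ids[j] != ids[i]]
        if not pos or not neg:
            continue  # no valid triplet for this anchor
        # Hardest positive is the farthest one, hardest negative the nearest.
        losses.append(max(0.0, margin + max(pos) - min(neg)))
    return sum(losses) / len(losses) if losses else 0.0


# Toy batch: two identities whose features are interleaved on a line.
features = [[0.0, 0.0], [2.0, 0.0], [1.0, 0.0], [3.0, 0.0]]
ids = [0, 0, 1, 1]
loss_same_group = group_hard_triplet_loss(features, ids, [0, 0, 0, 0])
# With each identity in its own group, no anchor has an in-group negative.
loss_split_groups = group_hard_triplet_loss(features, ids, [0, 0, 1, 1])
```

In the toy batch, every anchor's hardest positive is farther away than its hardest negative, so the loss is positive when all samples share one group, and it falls to zero when the groups leave no valid triplets.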
Highlights
Given a query person image, person re-identification aims to identify persons with the same identity (ID) in the gallery images.
Even a state-of-the-art model trained on a popular large-scale cross-camera ReID dataset still performed badly on the PoseTrack-ReID dataset, because in-video ReID is a different problem from cross-camera ReID and requires a model trained for the new task. That is why we proposed a new dataset, a new network structure, and a new loss function for the in-video problem. Unlike ReID-Head, the performance of the Part-based Convolutional Baseline (PCB) did not decrease with a larger G, because cross-camera ReID is a long-term problem, which discards short-term clues, while in-video
We proposed a new large-scale video-based in-video ReID dataset, PoseTrack-ReID, with full images available
Summary
Given a query person image, person re-identification aims to identify persons with the same identity (ID) in the gallery images. Some researchers tried to incorporate ReID into human tracking [3,4,5]. They utilized extra ReID datasets to train a ReID model and used the obtained model for feature extraction. The extracted ReID features were then used to identify the tracking target among candidate persons, achieving better performance. However, directly using a model trained on cross-camera ReID datasets, such as the Market-1501 [1] and CUHK03 [6] datasets, usually yields inferior performance because of cross-domain bias: the appearance in the source dataset is often very different from the appearance in the target dataset.
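The feature-matching step described above, picking the tracking target from candidate persons by comparing ReID features, can be sketched as follows. This is a minimal illustration that assumes features are plain lists of floats and that cosine similarity is the matching score; the cited tracking methods may use other distances or additional cues. The names `cosine_similarity` and `match_target` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero feature as matching nothing
    return dot / (norm_a * norm_b)


def match_target(target_feature, candidate_features):
    """Return (index, similarity) of the candidate closest to the target."""
    if not candidate_features:
        raise ValueError("no candidate features to match against")
    scores = [cosine_similarity(target_feature, c) for c in candidate_features]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


# Toy example: the second candidate points in nearly the same direction.
target = [1.0, 0.0, 0.0]
candidates = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0], [0.0, 0.0, 1.0]]
best_index, best_score = match_target(target, candidates)
```

In a tracker, the features would come from the ReID model, and a similarity threshold (not shown here) would typically decide whether the best candidate is accepted or the target is treated as lost.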