Video-based person re-identification (re-id) has attracted significant attention in recent years due to the increasing demand for video surveillance. However, existing methods usually rely on supervised learning, which requires a large number of labeled identities across cameras and is therefore impractical in real-world scenes. Although some unsupervised approaches have been proposed for video re-id, their performance is far from satisfactory. In this article, we propose an unsupervised anchor association learning (UAAL) framework for video-based person re-id, in which the feature representation of each sampled tracklet is regarded as an anchor. Specifically, we first propose an intracamera anchor association learning (IAAL) term that learns discriminative anchors by exploiting the affiliation relations between each image and the anchors within each camera. The anchors are then updated with an exponential moving average (EMA) strategy, and the updated anchors are stored in an anchor memory module. On top of that, a cross-camera anchor association learning (CAAL) term is introduced to mine potential positive anchor pairs across cameras through a cyclic ranking anchor alignment and threshold filtering method. Extensive experiments on two public datasets show the superiority of the proposed method: it achieves 73.2% rank-1 accuracy and 60.1% mean average precision (mAP) on MARS, and 89.7% rank-1 accuracy and 87.0% mAP on DukeMTMC-VideoReID.
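
The EMA anchor update and the cross-camera pair mining described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the momentum value, the similarity measure, the ranking depth, or the threshold, so the cosine similarity, the top-1 cycle check, and the default values `momentum=0.9` and `threshold=0.5` below are all assumptions, and the function names are hypothetical.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def ema_update(anchor, feature, momentum=0.9):
    """EMA update of one anchor with a new image feature.

    The momentum value and the re-normalization step are assumptions.
    """
    mixed = [momentum * a + (1.0 - momentum) * f for a, f in zip(anchor, feature)]
    return l2_normalize(mixed)


def cosine(u, v):
    """Cosine similarity (an assumed choice of similarity measure)."""
    return sum(a * b for a, b in zip(l2_normalize(u), l2_normalize(v)))


def cyclic_positive_pairs(anchors_a, anchors_b, threshold=0.5):
    """Mine candidate positive anchor pairs between two cameras.

    Assumed simplification of cyclic ranking: anchor i in camera A is paired
    with its top-1 match j in camera B only if j's top-1 match back in
    camera A is i again, and their similarity passes the threshold.
    """
    pairs = []
    if not anchors_a or not anchors_b:
        return pairs
    for i, a in enumerate(anchors_a):
        # Forward ranking: best match for anchor i among camera B anchors.
        j = max(range(len(anchors_b)), key=lambda n: cosine(a, anchors_b[n]))
        # Backward ranking: best match for anchor j among camera A anchors.
        back = max(range(len(anchors_a)), key=lambda m: cosine(anchors_b[j], anchors_a[m]))
        # Keep the pair only if the cycle closes and the similarity is high enough.
        if back == i and cosine(a, anchors_b[j]) >= threshold:
            pairs.append((i, j))
    return pairs


if __name__ == "__main__":
    camera_a = [[1.0, 0.0], [0.0, 1.0]]
    camera_b = [[0.9, 0.1], [0.1, 0.9]]
    print(cyclic_positive_pairs(camera_a, camera_b))
    print(ema_update(camera_a[0], camera_b[0]))
```

In this sketch, raising `threshold` trades recall for precision: fewer cross-camera pairs survive, but those that do are more likely to share an identity.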