Abstract

Fully unsupervised person re-identification (ReID) methods aim to learn discriminative features without using labeled ReID data. Because these methods are easily affected by camera discrepancies, related studies have typically designed optimization methods that enable the model to learn camera-invariant features. However, they often ignore the impact of camera discrepancies on clustering results. Specifically, camera discrepancies reduce intra-class camera diversity and promote the generation of noisy labels. To solve these problems, we propose a unified unsupervised learning framework: the camera-invariant feature learning (CIFL) framework. First, we design a novel DBSCAN-NN algorithm within the CIFL framework that improves intra-class camera diversity by forcibly merging samples from different cameras. Second, we design feature ensemble clustering, which improves the accuracy of the pseudo-labels by clustering feature ensembles. In addition, we design an optimization method that targets camera discrepancies, the stochastic pulled loss, which forces the ReID model to learn camera-invariant features. We verify the effectiveness and generalization of CIFL on four ReID datasets (Market-1501, DukeMTMC-reID, MSMT17, and CUHK03-NP). The experimental results show that CIFL not only outperforms existing fully unsupervised methods but also surpasses unsupervised domain adaptation methods.

