Part feature learning plays a crucial role in achieving fine-grained semantic understanding in unsupervised vehicle re-identification. However, existing approaches model part and global features directly, which can easily lead to severe gradient vanishing because the two feature types carry unequal information and the pseudo-labels are unreliable. To address this problem, we propose a triplet contrastive representation learning (TCRL) framework, which uses cluster features to bridge part features and global features for unsupervised vehicle re-identification. Specifically, TCRL maintains three memory banks that store instance and cluster features and introduces a proxy contrastive loss (PCL) that performs contrastive learning between adjacent memory banks, so that the association between part and global features is expressed as a transition through the part-cluster and cluster-global associations. Because the cluster memory bank aggregates all vehicle features, it can summarize them into a discriminative feature representation. To further exploit the instance and cluster information, TCRL introduces two additional loss functions. At the instance level, a hybrid contrastive loss (HCL) redefines the sample correlations by pulling positive instance features closer and pushing all negative instance features away. At the cluster level, a weighted regularization cluster contrastive loss (WRCCL) refines the pseudo-labels by penalizing mislabeled images according to instance similarity. Extensive experiments show that TCRL outperforms many state-of-the-art unsupervised vehicle re-identification approaches.
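To make the idea of contrasting a feature against a cluster memory bank concrete, the following is a minimal sketch of a generic InfoNCE-style cluster contrastive loss with a momentum-updated memory. It is not the PCL, HCL, or WRCCL formulation from the paper, which the abstract does not specify. The function names, the temperature of 0.05, the momentum of 0.2, and the toy two-dimensional features are all illustrative assumptions.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cluster_contrastive_loss(feature, label, memory, temperature=0.05):
    """Generic InfoNCE loss of one feature against all cluster centroids.

    `memory` is a list of unit-length centroids, one per pseudo-label.
    The temperature value is an assumption, not taken from the paper.
    """
    q = l2_normalize(feature)
    logits = [dot(q, c) / temperature for c in memory]
    # Numerically stable log-sum-exp over all clusters.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    # Cross-entropy with the pseudo-label as the target class.
    return log_sum - logits[label]


def momentum_update(memory, feature, label, momentum=0.2):
    """Move the centroid of `label` toward the feature, then re-normalize."""
    q = l2_normalize(feature)
    mixed = [momentum * c + (1 - momentum) * x for c, x in zip(memory[label], q)]
    memory[label] = l2_normalize(mixed)


# Toy example: two clusters and one (hypothetical) part feature.
memory = [l2_normalize([1.0, 0.0]), l2_normalize([0.0, 1.0])]
part_feature = [0.9, 0.1]

loss_correct = cluster_contrastive_loss(part_feature, 0, memory)
loss_wrong = cluster_contrastive_loss(part_feature, 1, memory)
momentum_update(memory, part_feature, 0)
```

In this sketch the loss is small when the feature is assigned to the cluster it resembles and large otherwise. A bridged design in the spirit of the abstract would apply such a loss once between part features and the cluster memory and once between global features and the same cluster memory, though the exact form used by TCRL may differ.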