Abstract

Learning with self-derived targets provides a non-contrastive method for unsupervised image representation learning, where variety among the targets is crucial. Recent work has achieved good performance by learning with targets obtained via cluster balancing. However, the equal-cluster-size constraint is too restrictive when the data have imbalanced categories or arrive in small batches. In this paper, we propose a new clustering-based approach for non-contrastive image representation learning that requires no particular architecture design, no extra memory bank, and no explicit constraint on cluster size. The key formulation is to learn embedding consistency and variable decorrelation in the cluster space by driving the batch-wise cross-correlation matrix toward the identity matrix. With this identitization loss incorporated, the predicted cluster assignments of two randomly augmented views of the same image serve as targets for each other. We carried out comprehensive experimental studies of linear classification on the learned representations of benchmark image datasets. Our results show that the proposed approach significantly outperforms state-of-the-art approaches and is more robust to class imbalance than approaches that rely on cluster balancing.
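To make the identitization idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes that cluster assignments are softmax outputs, that each cluster column is L2-normalized over the batch before the cross-correlation is taken, and that diagonal and off-diagonal deviations from the identity are penalized by squared error with a hypothetical weight `off_diag_weight`. All function names are illustrative.

```python
import math


def softmax(row):
    """Numerically stable softmax over one sample's cluster logits."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def normalize_columns(matrix, eps=1e-8):
    """L2-normalize each column (one cluster) over the batch dimension.

    Assumption: the abstract does not state which normalization is used.
    """
    cols = []
    for col in zip(*matrix):
        norm = math.sqrt(sum(x * x for x in col))
        cols.append([x / (norm + eps) for x in col])
    return [list(r) for r in zip(*cols)]


def cross_correlation(za, zb):
    """K x K matrix whose entry (i, j) correlates cluster i of view A
    with cluster j of view B across the batch."""
    n, k = len(za), len(za[0])
    return [[sum(za[b][i] * zb[b][j] for b in range(n)) for j in range(k)]
            for i in range(k)]


def identitization_loss(logits_a, logits_b, off_diag_weight=1.0):
    """Squared distance between the cross-correlation matrix and the identity.

    The diagonal term rewards consistent assignments across the two views;
    the off-diagonal term decorrelates different clusters.
    """
    pa = normalize_columns([softmax(r) for r in logits_a])
    pb = normalize_columns([softmax(r) for r in logits_b])
    c = cross_correlation(pa, pb)
    k = len(c)
    on_diag = sum((c[i][i] - 1.0) ** 2 for i in range(k))
    off_diag = sum(c[i][j] ** 2
                   for i in range(k) for j in range(k) if i != j)
    return on_diag + off_diag_weight * off_diag


# Imbalanced toy batch: three samples in cluster 0, one each in clusters 1 and 2.
view_a = [[20, 0, 0], [20, 0, 0], [20, 0, 0], [0, 20, 0], [0, 0, 20]]
# A second view that swaps clusters 0 and 1, i.e. inconsistent assignments.
view_swapped = [[row[1], row[0], row[2]] for row in view_a]

loss_consistent = identitization_loss(view_a, view_a)
loss_swapped = identitization_loss(view_a, view_swapped)
```

Under these assumptions, confident and consistent assignments give a loss close to zero even when the clusters have unequal sizes, which matches the abstract's point that no equal-cluster-size constraint is needed. Inconsistent assignments across the two views give a clearly larger loss.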
