Abstract

Few-shot visual recognition aims to identify novel, unseen classes from only a few labeled examples by learning generalizable prior knowledge from base classes. Recent work explores this problem in an unsupervised setting, i.e., without any labels on the base classes, which reduces the heavy cost of manual annotation. In this paper, we build on a self-supervised insight and propose a novel unsupervised learning approach that jointly learns Invariant and Consistent (InCo) representations for the few-shot task. For invariant representation, we present a geometric invariance module that predicts the rotation applied to each instance, capturing intra-instance variation and improving feature discrimination. To further build consistent inter-instance representations, we propose a pairwise consistency module with two contrastive learning components: holistic contrastive learning against historical training queues, and local contrastive learning that enhances the representations of the current training samples. Moreover, to better facilitate contrastive learning among features, we introduce an asymmetric convolutional architecture that encodes high-quality representations. Comprehensive experiments on four public benchmarks demonstrate the utility of our approach and its superiority over existing approaches.
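
To make the two self-supervised signals described above concrete, the sketch below combines a rotation-prediction pretext loss (geometric invariance) with a queue-based InfoNCE loss (holistic pairwise consistency) on top of a toy encoder. This is an illustrative assumption, not the paper's implementation: the names InCoSketch, rotate_batch, and inco_losses are hypothetical, the backbone is a placeholder rather than the asymmetric convolutional architecture, and the local contrastive term, momentum key encoder, and queue updates are omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def rotate_batch(x):
    """Create 0/90/180/270-degree rotated copies of each image and the
    matching rotation labels for the geometric-invariance pretext task."""
    rotations = [x,
                 torch.rot90(x, 1, dims=(2, 3)),
                 torch.rot90(x, 2, dims=(2, 3)),
                 torch.rot90(x, 3, dims=(2, 3))]
    labels = torch.arange(4).repeat_interleave(x.size(0))
    return torch.cat(rotations, dim=0), labels

class InCoSketch(nn.Module):
    """Toy encoder with a rotation-prediction head and a projection head
    used for queue-based contrastive learning (hypothetical stand-in)."""
    def __init__(self, feat_dim=128, queue_size=4096):
        super().__init__()
        self.encoder = nn.Sequential(              # placeholder backbone
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.rot_head = nn.Linear(64, 4)           # predicts one of 4 rotations
        self.proj_head = nn.Linear(64, feat_dim)   # features for contrastive loss
        # frozen random queue standing in for historical training features
        self.register_buffer(
            "queue", F.normalize(torch.randn(queue_size, feat_dim), dim=1))

    def forward(self, x):
        feat = self.encoder(x)
        return self.rot_head(feat), F.normalize(self.proj_head(feat), dim=1)

def inco_losses(model, images, temperature=0.2):
    """Sum of the rotation-prediction loss and an InfoNCE-style loss that
    treats two augmented views as a positive pair and queue entries as negatives."""
    rotated, rot_labels = rotate_batch(images)
    rot_logits, _ = model(rotated)
    loss_rot = F.cross_entropy(rot_logits, rot_labels)

    # two stochastic views of the same batch act as the positive pair
    view_q = images + 0.05 * torch.randn_like(images)
    view_k = images + 0.05 * torch.randn_like(images)
    _, q = model(view_q)
    with torch.no_grad():
        _, k = model(view_k)
    pos = (q * k).sum(dim=1, keepdim=True)     # B x 1 positive similarities
    neg = q @ model.queue.t()                  # B x K negatives from the queue
    logits = torch.cat([pos, neg], dim=1) / temperature
    loss_con = F.cross_entropy(logits, torch.zeros(q.size(0), dtype=torch.long))
    return loss_rot + loss_con

model = InCoSketch()
print(inco_losses(model, torch.randn(8, 3, 32, 32)).item())
```

In a full training loop the key features would come from a momentum-updated encoder and would be pushed into the queue after each step; the sketch only shows how the two losses are formed and combined.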
