In contrastive self-supervised learning, positive samples are typically drawn from different augmented views of the same image, which limits the diversity of available positives. An effective way to alleviate this problem is to exploit relationships between samples, e.g., by treating the top-K nearest neighbors of a positive sample as additional positives. However, false neighbors (i.e., neighbors that do not belong to the same category as the positive sample) are a significant yet often overlooked challenge. In this paper, we present Mixed Nearest-Neighbors (MNN), a simple framework for self-supervised learning. The primary objective of MNN is to enhance the robustness and accuracy of the model by strategically incorporating synthesized neighbor samples. Specifically, our method utilizes image mixture to mitigate the noise introduced by false neighbors, while adopting an intuitive weighting strategy to effectively integrate these synthesized neighbors into training. Under linear evaluation on CIFAR-10, CIFAR-100, STL-10, and Tiny ImageNet, MNN improves on previous state-of-the-art (SOTA) methods by 0.41% to 1.96% and on the Mean Shift (MSF) method by 1.82% to 8.8%. Furthermore, the proposed components can be seamlessly integrated into other self-supervised learning algorithms to enhance their performance. Code is available at https://github.com/pc-cp/MNN.
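To make the core idea concrete, the following is a minimal sketch of mixing retrieved nearest neighbors with a query to dilute false neighbors. Note the assumptions: the paper describes mixing at the image level, whereas for compactness this sketch applies the mixup-style convex combination in embedding space; the function name `mnn_targets`, the fixed mixing coefficient `lam`, and the softmax-over-similarity weighting are all illustrative choices, not the authors' exact formulation.

```python
import torch
import torch.nn.functional as F

def mnn_targets(query: torch.Tensor, bank: torch.Tensor, k: int = 5, lam: float = 0.5):
    """Sketch of nearest-neighbor mixing (hypothetical helper, not the paper's API).

    query: (B, D) batch of query embeddings.
    bank:  (N, D) memory bank of past embeddings.
    Returns k synthesized neighbors per query plus per-neighbor weights.
    """
    query = F.normalize(query, dim=1)            # unit-normalize for cosine similarity
    bank = F.normalize(bank, dim=1)
    sim = query @ bank.t()                       # (B, N) cosine similarities
    topk_sim, topk_idx = sim.topk(k, dim=1)      # top-K nearest neighbors per query
    neighbors = bank[topk_idx]                   # (B, k, D) neighbor embeddings
    # Convex combination of each query with its neighbors: a false neighbor
    # contributes only a (1 - lam) fraction of the synthesized target,
    # which is the noise-dilution intuition behind mixing.
    mixed = F.normalize(lam * query.unsqueeze(1) + (1 - lam) * neighbors, dim=2)
    # Assumed weighting: trust neighbors in proportion to their similarity.
    weights = topk_sim.softmax(dim=1)            # (B, k), sums to 1 per query
    return mixed, weights
```

In a training loop, `mixed` would serve as the extra positive targets for the contrastive or regression loss, scaled by `weights`, so that low-similarity (likely false) neighbors contribute less to the gradient.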