Abstract

Self-supervised contrastive learning offers a means of learning informative features from a pool of unlabeled data. In this paper, we propose a coreset selection method that requires no labels at all. Among self-supervised methods, contrastive learning has recently and consistently delivered the highest performance, which prompted us to adopt two leading contrastive learning methods: the simple framework for contrastive learning of visual representations (SimCLR) and the momentum contrast (MoCo) framework. At every epoch of contrastive learning, we calculated the cosine similarity for each example and accumulated these values over the entire training run to obtain a coreset score. Our assumption was that a sample with low accumulated similarity would likely behave as a coreset sample. Compared with existing label-based coreset selection methods, our approach reduces the cost associated with human annotation. In this study, the unsupervised method implemented for coreset selection achieved improvements of 1.25% (for CIFAR10), 0.82% (for SVHN), and 0.19% (for QMNIST) over a randomly selected subset with a size of 30%. Furthermore, our results are comparable to those of existing supervised coreset selection methods. The differences between the proposed method and a representative supervised coreset selection method (forgetting events) were 0.81% on the CIFAR10 dataset, −2.08% on the SVHN dataset (the proposed method outperformed the existing method), and 0.01% on the QMNIST dataset at a subset size of 30%. In addition, our proposed approach remained robust even when the coreset selection model and target model were not identical (e.g., using ResNet18 as the selection model and ResNet101 as the target model).
Lastly, we obtained more concrete evidence that our coreset examples are highly informative by showing the performance gap between the coreset and non-coreset samples in the coreset cross-test experiment. We observed performance pairs of the form ((testing: non-coreset, training: coreset), (testing: coreset, training: non-coreset)), i.e., (94.27%, 67.39%) for CIFAR10, (98.24%, 83.30%) for SVHN, and (99.89%, 93.07%) for QMNIST with a subset size of 30%.
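The scoring procedure described above (accumulate per-epoch cosine similarities, then keep the lowest-scoring examples) can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: it assumes you already have, for each epoch, the embeddings of the two augmented views of every example, and the names `select_coreset` and `subset_ratio` are ours.

```python
import numpy as np

def cosine_similarity(a, b):
    # Row-wise cosine similarity between two embedding matrices
    # of shape (n_samples, embedding_dim).
    a_norm = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_norm = b / np.linalg.norm(b, axis=1, keepdims=True)
    return np.sum(a_norm * b_norm, axis=1)

def select_coreset(embeddings_per_epoch, subset_ratio=0.3):
    """Accumulate per-epoch cosine similarities between each example's
    two augmented views and return the indices of the lowest-scoring
    examples as the coreset.

    embeddings_per_epoch: list of (view_a, view_b) embedding pairs,
    one pair of (n_samples, dim) arrays per training epoch.
    """
    n_samples = embeddings_per_epoch[0][0].shape[0]
    scores = np.zeros(n_samples)
    for view_a, view_b in embeddings_per_epoch:
        scores += cosine_similarity(view_a, view_b)
    k = int(n_samples * subset_ratio)
    # Low accumulated similarity -> harder, presumably more
    # informative examples -> coreset.
    return np.argsort(scores)[:k]
```

Under this assumption, an example whose two augmented views stay dissimilar throughout training accumulates a low score and is selected, which mirrors the paper's intuition that low-similarity samples are the informative ones.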

Highlights

  • Deep learning-based methods have been highly effective in performing computer vision tasks such as image classification [1], object detection [2], and semantic segmentation [3]

  • The results of our study demonstrate that contrastive learning can extend to unsupervised coreset selection

  • We plotted the distribution of the average cosine similarity (cossim) for each dataset, as shown in Fig. 9. It is evident that the datasets, in order of decreasing informativeness, are CIFAR10, SVHN, and QMNIST, as they consist of RGB object images, RGB digit images, and grayscale digit images, respectively



Introduction

Deep learning-based methods have been highly effective in performing computer vision tasks such as image classification [1], object detection [2], and semantic segmentation [3]. These methods generally require large amounts of data to produce accurate results; in particular, human annotation, an essential part of supervised learning, can be costly and labor-intensive. In other words, when building a new training dataset for deep learning, we should consider the following constraints: i) huge annotation costs (we cannot afford to annotate all of a huge number of unlabeled examples), ii) limited storage, and iii) limited computational power. These limitations grow linearly with the number of examples. Traditional methods mainly perform random selection, which is likely to miss the most informative examples.

