Abstract

Labeling large amounts of data is a major issue in deep learning because of the high cost of annotation. One way to address this issue is active learning, which selects from a large unlabeled pool the data that are most informative for training a model on the task at hand. Many active learning approaches rely on uncertainty-based or diversity-based methods, and both have produced good results. However, uncertainty-based methods risk sampling redundant data, and redundant data can adversely affect training of the model. Diversity-based methods risk missing data that are important for training the model. In this paper, we propose uncertainty-based Selective Clustering for Active Learning (SCAL), a method that clusters only the data with high uncertainty and then samples data from each cluster to reduce redundancy. SCAL is expected to widen the region of the decision boundary covered by the sampled data. SCAL achieves cutting-edge performance on classification tasks with balanced and imbalanced image datasets as well as on semantic segmentation tasks.
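To make the selection procedure described above concrete, the following is a minimal sketch of one SCAL-style query round, assuming predictive entropy as the uncertainty score, k-means as the clustering method, and one sample taken per cluster; the function name `scal_select` and these particular choices are illustrative assumptions, not details given in the abstract.

```python
import numpy as np
from sklearn.cluster import KMeans

def scal_select(probs, features, n_candidates, n_query, seed=0):
    """Sketch of one uncertainty-then-cluster selection round.

    probs:        (N, C) softmax outputs of the current model on the unlabeled pool
    features:     (N, D) feature embeddings of the same pool
    n_candidates: number of high-uncertainty points kept before clustering (assumed)
    n_query:      labeling budget for this round (= number of clusters, assumed)
    """
    # 1. Uncertainty step: score every unlabeled point.
    #    Predictive entropy is an assumption; the paper's measure may differ.
    entropy = -np.sum(probs * np.log(probs + 1e-12), axis=1)
    candidates = np.argsort(entropy)[-n_candidates:]

    # 2. Selective clustering step: cluster only the high-uncertainty candidates,
    #    so redundant (near-duplicate) points fall into the same cluster.
    km = KMeans(n_clusters=n_query, random_state=seed, n_init=10)
    labels = km.fit_predict(features[candidates])

    # 3. Sampling step: take the most uncertain point from each cluster,
    #    spreading the queried samples along the decision boundary.
    selected = []
    for c in range(n_query):
        members = candidates[labels == c]
        selected.append(members[np.argmax(entropy[members])])
    return np.array(selected)
```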

