Abstract

Active learning reduces labeling costs by selecting and labeling the most informative examples from an unlabeled pool. However, most existing active learning approaches estimate informativeness with uncalibrated confidence, which makes the estimates unreliable. These approaches generally ignore two significant issues caused by uncalibrated confidence. First, the average uncalibrated confidence produced by modern neural networks is usually higher than their accuracy. Second, predictions for examples located near the decision boundaries are unstable as the target model updates its parameters, not only in the last several epochs but even throughout the training process. This phenomenon, caused by the forgetting characteristic of neural networks, has a significant impact on models that estimate informativeness from predicted probability vectors or pseudo labels. To address these issues, in this paper, we propose a novel active learning approach that reliably estimates informativeness with calibrated confidence. Specifically, we integrate the intermediate predictions that the target model generates for each unlabeled example during the training process to produce calibrated confidence. The calibrated confidence can capture a tendentious label from an indecisive subset of the class space. We show that the calibrated confidence with tendentiousness preserves the ability to make correct predictions. The empirical results demonstrate that our approach outperforms state-of-the-art active learning methods on image classification tasks.
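
The idea of integrating intermediate predictions can be illustrated with a minimal sketch. The abstract does not state the integration rule, so the code below assumes a plain average of the per-epoch probability vectors and a least-confidence selection criterion. The function names `average_predictions` and `select_least_confident`, as well as the toy pool, are hypothetical and are not the paper's implementation.

```python
from typing import Dict, List, Sequence


def average_predictions(snapshots: Sequence[Sequence[float]]) -> List[float]:
    """Average the probability vectors recorded for one example across epochs.

    Assumption: a plain mean stands in for the paper's integration step.
    """
    n = len(snapshots)
    num_classes = len(snapshots[0])
    return [sum(p[c] for p in snapshots) / n for c in range(num_classes)]


def select_least_confident(
    pool: Dict[str, Sequence[Sequence[float]]], budget: int
) -> List[str]:
    """Return the `budget` example ids with the lowest averaged top-class confidence."""
    scores = {ex: max(average_predictions(snaps)) for ex, snaps in pool.items()}
    return sorted(scores, key=scores.get)[:budget]


# Toy pool: each example maps to its per-epoch probability vectors (two classes).
pool = {
    "stable": [[0.90, 0.10], [0.92, 0.08], [0.95, 0.05]],
    # Prediction flips in the middle epoch, yet the last epoch looks confident.
    "flipping": [[0.90, 0.10], [0.20, 0.80], [0.85, 0.15]],
}

selected = select_least_confident(pool, budget=1)
```

In this toy pool, the last-epoch confidence of the `flipping` example is 0.85, while its averaged confidence is about 0.65, so a criterion based on the averaged vector ranks it as more informative than the `stable` example.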
