Abstract
Active learning (AL) aims to select valuable samples for labeling from an unlabeled pool, building a training dataset at minimal annotation cost. Traditional methods typically require an initially labeled sample set to start active selection, and then query annotations incrementally over several iterations. However, this scheme is not effective in deep learning scenarios. On the one hand, an initially labeled sample set is not always available at the outset. On the other hand, the model's performance is usually poor in the early iterations due to limited training feedback. For the first time, we propose a cold-start AL model based on representative sampling (CALR), which selects valuable samples without requiring an initial labeled set or iterative feedback from the target model. Experiments on three image classification datasets, CIFAR-10, CIFAR-100, and Caltech-256, show that CALR achieves a new state of the art for AL in cold-start settings. In particular, under low annotation budgets, our method achieves up to a 10% performance increase over traditional methods. Furthermore, CALR can be combined with warm-start methods to improve start-up efficiency while further raising the performance ceiling of AL, giving CALR broader applicability.
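To make the cold-start setting concrete, the sketch below shows one common way to select representative samples with no initial labels and no model-in-the-loop feedback: embed the unlabeled pool and pick the points closest to k-means centroids. This is an illustrative assumption about the general recipe, not the paper's exact CALR algorithm; the encoder, the clustering step, and the `select_representatives` helper are all hypothetical.

```python
# A minimal sketch of cold-start representative sampling (assumed recipe,
# not the paper's exact CALR method): cluster pool embeddings and query
# the most central sample of each cluster, once, with no labeled seed set.
import numpy as np
from sklearn.cluster import KMeans

def select_representatives(features: np.ndarray, budget: int,
                           seed: int = 0) -> np.ndarray:
    """Return `budget` pool indices whose features are closest to
    k-means centroids.

    features : (n_samples, dim) embeddings of the unlabeled pool,
               e.g. from a self-supervised encoder (assumption).
    budget   : number of samples to send for annotation.
    """
    km = KMeans(n_clusters=budget, n_init=10, random_state=seed)
    km.fit(features)
    selected = []
    for c in range(budget):
        members = np.where(km.labels_ == c)[0]
        # Distance of each cluster member to its own centroid.
        dists = np.linalg.norm(
            features[members] - km.cluster_centers_[c], axis=1)
        # The most central member stands in for the whole cluster.
        selected.append(members[np.argmin(dists)])
    return np.array(selected)

# Usage (hypothetical): annotate pool[idx] in a single round.
# idx = select_representatives(embeddings, budget=1000)
```

Because the selection depends only on the feature space, the labeled set can be produced in one shot before any target-model training, which is what distinguishes a cold-start scheme from iterative warm-start AL.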