Abstract

Unsupervised active learning, whose goal is to choose a subset of representative samples to be labeled in an unsupervised setting, has become an active research topic in the machine learning and computer vision communities. Most existing approaches rely on shallow linear models: they assume that each sample can be well approximated by the span (i.e., the set of all linear combinations) of the selected samples, and then take these selected samples as the representative ones for manual labeling. However, in many real-world scenarios the data do not necessarily conform to linear models, and modeling the nonlinearity of the data often becomes the key issue in unsupervised active learning. Moreover, existing works often aim to reconstruct the whole dataset well while ignoring the important cluster structure, especially for imbalanced data. In this paper, we present a novel deep unsupervised active learning framework. The proposed method explicitly learns a nonlinear embedding that maps each input into a latent space via a deep neural network, and introduces a selection block that selects the representative samples in the learnt latent space through a self-supervised learning strategy. In the selection block, we aim not only to preserve the global structure of the data, but also to capture its cluster structure, in order to handle the data imbalance issue during sample selection. Meanwhile, we take advantage of the clustering result to provide self-supervised information to guide the above processes. Finally, we attempt to preserve the local structure of the data, so that the data embedding becomes more precise and the model performance can be further improved. Extensive experimental results on several publicly available datasets demonstrate the effectiveness of our method compared with state-of-the-art approaches.
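
To make the linear assumption behind the shallow baselines concrete, the following is a minimal sketch of selecting representative samples so that every sample is well approximated by the span of the selected ones. It is not the proposed deep framework, and it is not the optimizer used by any particular prior work: the greedy forward search and the function names `reconstruction_error` and `greedy_select` are assumptions introduced here purely for illustration.

```python
import numpy as np


def reconstruction_error(X, idx):
    """Squared error when each row of X is approximated by a linear
    combination of the selected rows X[idx] (hypothetical helper)."""
    S = X[idx]  # (k, d) selected samples
    # Least-squares coefficients for S^T @ W^T ~= X^T, i.e. X ~= W @ S.
    W_t, *_ = np.linalg.lstsq(S.T, X.T, rcond=None)
    residual = X.T - S.T @ W_t
    return float(np.sum(residual ** 2))


def greedy_select(X, k):
    """Greedily pick k row indices of X, each time adding the sample that
    most reduces the reconstruction error (illustrative strategy only)."""
    selected = []
    for _ in range(k):
        candidates = [i for i in range(len(X)) if i not in selected]
        errors = [reconstruction_error(X, selected + [i]) for i in candidates]
        selected.append(candidates[int(np.argmin(errors))])
    return selected
```

When the data really lie in a low-dimensional linear subspace, a handful of selected samples reconstructs everything almost exactly. When the data lie on a nonlinear manifold, or when small clusters contribute little to the total error, this criterion can overlook them, which is the limitation the abstract points to.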
