Abstract

Active learning addresses settings in which unlabeled data are abundant but labeled data are scarce. It aims to select the most valuable samples for labeling so that powerful predictive models can be built with minimal supervision. When the data are characterized by high-dimensional features, however, the limited labeled data make it difficult to obtain reliable estimates of the model parameters. Most existing works tackle this problem by learning a low-dimensional representation of the data before active learning, but this does not guarantee good results, because traditional feature extraction techniques and active learning algorithms are designed independently. In this paper, we propose an efficient hybrid active learning algorithm called Recursive Maximum Margin Active Learning. We optimize active learning and semi-supervised feature extraction under a unified framework to address the problems posed by high-dimensional features and to select the most representative samples in the low-dimensional space. By introducing the semi-supervised maximum margin criterion into active learning, we conduct sample selection and feature extraction recursively at each iteration of active learning to learn more accurate models. Extensive experimental results show that the proposed method outperforms several state-of-the-art active learning methods on publicly available datasets.
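
The alternation between feature extraction and sample selection described above can be illustrated with a minimal sketch. It is not the algorithm proposed in the paper, whose objective and selection rule are not given in this abstract. It assumes the classical maximum margin criterion (projecting onto the top eigenvectors of S_b - S_w), a simple semi-supervised term that adds the total scatter of all data weighted by a hypothetical parameter `beta`, and a stand-in selection rule that queries the unlabeled point lying closest to the boundary between its two nearest class centroids. All function names, the `oracle` callback, and `beta` are illustrative assumptions, and the initial labeled set is assumed to contain at least two classes.

```python
import numpy as np


def mmc_projection(X_lab, y_lab, X_all, dim, beta=0.1):
    """Return a (d, dim) projection from the top eigenvectors of S_b - S_w + beta * S_t.

    S_b and S_w are the between-class and within-class scatter of the labeled
    data; S_t is the total scatter of all data (assumed semi-supervised term).
    """
    d = X_lab.shape[1]
    mean_lab = X_lab.mean(axis=0)
    S_b = np.zeros((d, d))
    S_w = np.zeros((d, d))
    for c in np.unique(y_lab):
        Xc = X_lab[y_lab == c]
        diff = (Xc.mean(axis=0) - mean_lab)[:, None]
        S_b += len(Xc) * (diff @ diff.T)
        centered = Xc - Xc.mean(axis=0)
        S_w += centered.T @ centered
    S_b /= len(X_lab)
    S_w /= len(X_lab)
    centered_all = X_all - X_all.mean(axis=0)
    S_t = centered_all.T @ centered_all / len(X_all)
    # eigh returns eigenvalues in ascending order, so reverse the columns.
    _, eigvecs = np.linalg.eigh(S_b - S_w + beta * S_t)
    return eigvecs[:, ::-1][:, :dim]


def select_sample(Z_lab, y_lab, Z_unlab):
    """Return the position in Z_unlab of the point nearest a centroid boundary.

    Stand-in rule: smallest gap between the distances to the two nearest
    class centroids in the projected space.
    """
    centroids = np.stack([Z_lab[y_lab == c].mean(axis=0) for c in np.unique(y_lab)])
    dists = np.linalg.norm(Z_unlab[:, None, :] - centroids[None, :, :], axis=2)
    dists.sort(axis=1)
    return int(np.argmin(dists[:, 1] - dists[:, 0]))


def recursive_active_learning(X, oracle, init_idx, n_queries, dim):
    """Alternate feature extraction and sample selection for n_queries rounds."""
    labeled = list(init_idx)
    labels = [oracle(i) for i in labeled]
    W = None
    for _ in range(n_queries):
        taken = set(labeled)
        unlabeled = [i for i in range(len(X)) if i not in taken]
        y_lab = np.array(labels)
        # Re-learn the low-dimensional space from the current labeled set.
        W = mmc_projection(X[labeled], y_lab, X, dim)
        Z = X @ W
        # Choose the next query in that space, then ask the oracle for its label.
        pick = unlabeled[select_sample(Z[labeled], y_lab, Z[unlabeled])]
        labeled.append(pick)
        labels.append(oracle(pick))
    return labeled, labels, W
```

Because the projection is recomputed after every query, each newly labeled sample changes the space in which the next sample is chosen, which is the coupling the abstract attributes to the recursive scheme.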
