Abstract

Many real-world classification problems lack the large amount of labeled data needed to learn an effective classifier. Active learning methods address this problem by reducing the number of labeled instances needed to build an effective classifier. Most current active learning methods, however, are myopic, i.e., they select a single unlabeled sample to label at a time. Such a strategy is neither efficient nor optimal, so non-myopic active learning is preferred. Current non-myopic active learning methods are typically greedy, selecting the top N unlabeled samples with the highest scores. While efficient, such a greedy approach cannot guarantee the learner's performance. In this paper, we introduce a near-optimal non-myopic active learning algorithm that is efficient and also has a performance guarantee. Based on an expected error reduction objective function, our algorithm efficiently selects a set of samples for labeling at each iteration. By exploiting the submodular property of the objective function, the selected samples are guaranteed to be optimal or near optimal. Our experimental results on UCI data sets and a real-world application show that the proposed algorithm outperforms the myopic active learning method and existing non-myopic active learning methods in both efficiency and accuracy.
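To illustrate why submodularity matters for batch selection, the following is a minimal sketch of greedy maximization of a monotone submodular set function. It is not the algorithm of the paper: the toy coverage objective `coverage_value`, the data in `COVERAGE`, and the function `greedy_select` are all hypothetical stand-ins, and the paper's expected error reduction objective is not reproduced here. For monotone submodular objectives under a cardinality constraint, this kind of marginal-gain greedy selection is known to reach at least a (1 - 1/e) fraction of the optimal value, which is the sort of guarantee that ranking samples by independent scores does not provide.

```python
from typing import Callable, Dict, FrozenSet, Iterable, List, Set

# Hypothetical toy data: each candidate sample "covers" a set of items.
# Set coverage is a standard example of a monotone submodular function.
COVERAGE: Dict[str, Set[int]] = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {4, 5, 6, 7},
    "d": {1, 2},
}


def coverage_value(selected: FrozenSet[str]) -> int:
    """Number of distinct items covered by the selected candidates."""
    covered: Set[int] = set()
    for name in selected:
        covered |= COVERAGE[name]
    return len(covered)


def greedy_select(
    candidates: Iterable[str],
    objective: Callable[[FrozenSet[str]], float],
    batch_size: int,
) -> List[str]:
    """Pick up to batch_size candidates, each time adding the one with the
    largest marginal gain given what has already been selected."""
    selected: List[str] = []
    remaining = set(candidates)
    for _ in range(min(batch_size, len(remaining))):
        current = objective(frozenset(selected))
        # Sorting makes tie-breaking deterministic.
        best = max(
            sorted(remaining),
            key=lambda c: objective(frozenset(selected + [c])) - current,
        )
        selected.append(best)
        remaining.remove(best)
    return selected


batch = greedy_select(COVERAGE.keys(), coverage_value, batch_size=2)
```

Because each step re-evaluates gains conditioned on the current batch, a candidate that largely duplicates an earlier pick is passed over in favor of one that contributes more new value. A top-N ranking by individual score cannot make that distinction.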
