Abstract

Learning classification models in practice often relies on human annotation, in which humans assign class labels to data instances. Because this process can be very time-consuming and costly, finding effective ways to reduce the annotation cost is critical for building such models. To address this problem, instead of soliciting instance-based annotation, we explore region-based annotation as the feedback. A region is defined as a hyper-cubic subspace of the input feature space, and it covers the subpopulation of data instances that fall into it. In the binary classification setting, each region is labeled with a number in [0,1] that represents a human estimate of the positive (or negative) class proportion in the subpopulation. To learn a classifier from region-based feedback, we develop an active learning framework that hierarchically divides the input space into smaller and smaller regions. In each iteration, we split the region with the highest potential to improve the classification models. This iterative process allows us to gradually learn more refined classification models from more specific regions with more accurate proportions. Through experiments on numerous datasets, we demonstrate that our approach offers a new and promising active learning direction that can outperform existing active learning approaches, especially when the labeling budget is small.
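The hierarchical refinement described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how a region's "potential" is measured or where a region is split, so the scoring rule (`split_score`, instance count times `p * (1 - p)`), the midpoint split along the widest dimension, the simulated annotator (`oracle_proportion`, which reads the true labels), and all function names are assumptions made for illustration only.

```python
import numpy as np

def oracle_proportion(y, idx):
    """Simulated annotator (assumption): true positive fraction in the region."""
    return float(y[idx].mean()) if len(idx) > 0 else 0.5

def split_score(region):
    """Hypothetical 'potential': populous regions with mixed labels score highest."""
    return len(region["idx"]) * region["p"] * (1.0 - region["p"])

def refine_regions(X, y, budget):
    """Greedily split axis-aligned boxes until the query budget is spent.

    Each region query (one proportion label) costs one unit of budget.
    """
    idx = np.arange(len(X))
    root = {"lo": X.min(axis=0), "hi": X.max(axis=0), "idx": idx,
            "p": oracle_proportion(y, idx)}
    regions, queries = [root], 1
    while queries + 2 <= budget:
        # Pick the region with the highest (assumed) potential.
        best = max(range(len(regions)), key=lambda k: split_score(regions[k]))
        if split_score(regions[best]) == 0.0:
            break  # every region is already pure or empty
        parent = regions.pop(best)
        # Assumed split rule: cut the widest dimension at its midpoint.
        d = int(np.argmax(parent["hi"] - parent["lo"]))
        mid = 0.5 * (parent["lo"][d] + parent["hi"][d])
        go_left = X[parent["idx"], d] <= mid
        left_hi, right_lo = parent["hi"].copy(), parent["lo"].copy()
        left_hi[d], right_lo[d] = mid, mid
        children = [(parent["lo"], left_hi, parent["idx"][go_left]),
                    (right_lo, parent["hi"], parent["idx"][~go_left])]
        for lo, hi, child_idx in children:
            regions.append({"lo": lo, "hi": hi, "idx": child_idx,
                            "p": oracle_proportion(y, child_idx)})
            queries += 1
    return regions
```

The returned regions and their proportion labels could then serve as training signal for a classifier, for example by treating each region's proportion as a soft label for the instances it covers. That step is left out here because the abstract does not specify it.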
