Abstract

Multi-label classification generalizes conventional classification by allowing a single data point to carry multiple labels. Manually annotating a multi-label data point requires a human oracle to consider the presence or absence of every possible class separately, which involves significant labor. Active learning techniques are effective in reducing the human labeling effort needed to induce a classification model. When exposed to large quantities of unlabeled data, such algorithms automatically select the salient and representative instances for manual annotation. Further, to address the high redundancy in data such as image or video sequences, as well as the availability of multiple labeling agents, there have been recent attempts at a batch mode form of active learning, in which a batch of data points is selected simultaneously from an unlabeled set. In this work, we propose a novel optimization-based batch mode active learning strategy to minimize human labeling effort in multi-label classification problems. To the best of our knowledge, this is the first attempt to develop such a scheme primarily intended for the multi-label context. The proposed framework is computationally simple, easy to implement, and can be suitably modified to perform batch mode active learning in other formulations, such as single-label classification or problems involving hierarchical label spaces. Our results corroborate the efficacy of the proposed algorithm and demonstrate the framework's potential for use in real-world applications.
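
To make the idea of batch mode selection concrete, the following is a minimal sketch of a generic greedy heuristic. It is not the optimization-based strategy proposed in this work, whose formulation is not given in the abstract. It assumes each unlabeled instance has a feature vector and a per-label probability estimate from some current model, and it trades off uncertainty (mean binary entropy over labels) against redundancy (cosine similarity to instances already in the batch). The names `label_entropy`, `cosine`, `select_batch`, and `redundancy_weight` are hypothetical and chosen only for illustration.

```python
import math


def label_entropy(probs):
    """Mean binary entropy (in bits) over one instance's per-label probabilities."""
    total = 0.0
    for p in probs:
        p = min(max(p, 1e-12), 1 - 1e-12)  # clamp to avoid log(0)
        total += -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
    return total / len(probs)


def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def select_batch(features, label_probs, batch_size, redundancy_weight=1.0):
    """Greedily pick indices that are uncertain and dissimilar to earlier picks.

    Illustrative heuristic only (an assumption of this sketch), not the
    paper's optimization-based method.
    """
    uncertainty = [label_entropy(p) for p in label_probs]
    selected = []
    candidates = set(range(len(features)))

    while candidates and len(selected) < batch_size:
        def score(i):
            # Redundancy is the highest similarity to any instance already chosen.
            redundancy = max(
                (cosine(features[i], features[j]) for j in selected),
                default=0.0,
            )
            return uncertainty[i] - redundancy_weight * redundancy

        # Iterate in index order so ties are broken deterministically.
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


if __name__ == "__main__":
    # Toy data: instances 0 and 1 are near-duplicates with equal uncertainty.
    features = [[1.0, 0.0], [1.0, 0.01], [0.0, 1.0], [1.0, 1.0]]
    label_probs = [[0.5, 0.5], [0.5, 0.5], [0.6, 0.4], [0.99, 0.01]]
    print(select_batch(features, label_probs, batch_size=2))
```

In the toy data, the redundancy penalty keeps the second pick from being the near-duplicate of the first, even though both are equally uncertain. This is the motivation for selecting a batch jointly rather than simply taking the top-scoring instances one by one.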
