Abstract

Support vector machines (SVMs) have met with significant success in numerous real-world learning tasks. However, like most machine learning algorithms, SVMs are supervised learners and rest on the assumption that labeled data are easy to obtain, whereas in reality labeled data can be scarce or expensive to acquire. Active learning (AL) addresses this problem by requesting labels only for the most “informative” data points. To reduce human labeling effort while maintaining SVM performance, in this work we propose an uncertainty sampling-based active learning approach for SVMs that annotates the most uncertain unlabeled instances, i.e., an algorithm that selects the most informative instances for the SVM learning process. During SVM active learning, we first employ the decision margin of the SVM output as the initial uncertainty measure to select the most uncertain instances; then, to further reduce the number of unlabeled instances to be annotated, we employ the Ratio of Center-Distance to select the boundary vectors of the SVM. We provide a theoretical motivation for the algorithm. To verify its effectiveness and efficiency, we apply the proposed method to several standard UCI datasets. The experimental results show that our active learning method can significantly reduce labeling cost while achieving the desired performance.
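The first selection step described above, margin-based uncertainty sampling, can be sketched as follows. This is a minimal illustration using scikit-learn on synthetic data, not the authors' implementation; the dataset, pool sizes, and number of query rounds are arbitrary choices for demonstration, and the paper's Ratio of Center-Distance filtering step is omitted since its exact formulation is not given in the abstract.

```python
# Sketch of margin-based uncertainty sampling for SVM active learning:
# repeatedly train on the labeled pool, then query the unlabeled point
# closest to the decision boundary (smallest |f(x)|).
import numpy as np
from sklearn.svm import SVC
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Small initial labeled pool; the rest is the unlabeled pool.
labeled = list(range(10))
unlabeled = list(range(10, 200))

clf = SVC(kernel="linear")
for _ in range(5):  # a few active-learning rounds (arbitrary count)
    clf.fit(X[labeled], y[labeled])
    # Uncertainty measure: absolute value of the SVM decision function,
    # i.e. distance to the separating hyperplane.
    margins = np.abs(clf.decision_function(X[unlabeled]))
    query = unlabeled[int(np.argmin(margins))]
    labeled.append(query)      # the oracle "annotates" this point
    unlabeled.remove(query)

print(len(labeled), clf.score(X, y))
```

Each round adds exactly one label, so after five rounds the labeled pool has grown from 10 to 15 points; in a real deployment one would typically query a batch per round and stop when performance plateaus.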
