Identifying uncertainty regions in Support Vector Machines using geometric margin and convex hulls

Calin Voichita, Sorin Draghici, Purvesh Khatri

doi:10.1109/ijcnn.2008.4634269

Abstract

Like most classification techniques, existing support vector machine (SVM) approaches struggle to classify their input correctly when the data points are either very close to the decision boundary or very dissimilar from the training data set. In both situations, most classifiers, including SVMs, will still give a prediction by assigning the test point to one of the classes. However, when a test instance is very close to the decision boundary, the side of the boundary on which the instance lies, and hence the predicted class, will in many cases depend more on the choice of tuning or training parameters than on clear differences in features. Furthermore, if a test instance is substantially different from all instances used during training, classical SVM classifiers will still assign it to a class, although there is little evidence to support such an assignment. In both cases, it is very useful for a classifier to be able to assess its ability to classify a given instance by identifying those regions of the feature space in which the class assignments are less certain. In this paper, we propose two novel approaches based on: i) a geometric uncertainty margin and ii) the convex hulls of the training points in the feature space. Our proposed techniques improve upon existing SVM-based approaches by adding the ability to identify "uncertainty" areas where the assignment of a test instance to a class cannot be guaranteed. We illustrate both the problems and our novel techniques on the Iris data set from the UCI machine learning repository.
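The two ideas named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a linear decision function in a two-dimensional input space (the paper works with convex hulls in the SVM feature space), and the names `geometric_distance`, `convex_hull`, `in_hull`, `classify`, and the threshold `margin` are hypothetical, introduced only for illustration. A test point is flagged as uncertain if it lies closer to the separating hyperplane than `margin`, or if it falls outside the convex hull of every class's training points.

```python
import math

def geometric_distance(w, b, x):
    """Signed Euclidean distance from x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

def cross(o, a, b):
    """2D cross product of vectors o->a and o->b (positive for a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def in_hull(hull, x):
    """True if x is inside or on the boundary of a counter-clockwise convex polygon."""
    n = len(hull)
    if n < 3:
        raise ValueError("hull needs at least 3 non-collinear vertices")
    return all(cross(hull[i], hull[(i + 1) % n], x) >= 0 for i in range(n))

def classify(w, b, hulls, x, margin):
    """Assign a class, or report why the assignment is uncertain.

    `margin` is a hypothetical user-chosen threshold on the geometric
    distance; `hulls` holds one convex hull per training class.
    """
    d = geometric_distance(w, b, x)
    if abs(d) < margin:
        return "uncertain: near boundary"
    if not any(in_hull(h, x) for h in hulls):
        return "uncertain: outside training hulls"
    return "positive" if d > 0 else "negative"

# Toy training data (made up for illustration, not the Iris data set).
pos = [(2, 2), (3, 2), (3, 3), (2, 3)]
neg = [(-2, -2), (-3, -2), (-3, -3), (-2, -3)]
hulls = [convex_hull(pos), convex_hull(neg)]
w, b = (1.0, 1.0), 0.0  # assumed, already-trained linear separator

for point in [(2.5, 2.5), (-2.5, -2.5), (0.1, 0.1), (10, 10)]:
    print(point, classify(w, b, hulls, point, margin=0.5))
```

In this toy setting the first two points fall inside a class hull and well away from the boundary, the third lies within the uncertainty margin, and the fourth is far from all training data. In practice the separator would come from a trained SVM, and the hull test would need a method that works in higher dimensions.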
