In this paper we consider the problem of training a Support Vector Machine (SVM) online from a stream of data arriving in random order. We provide a fast online training algorithm for general SVMs on very large datasets. Based on the geometric interpretation of the SVM known as the polytope distance problem, our algorithm uses a gradient descent procedure to solve it. With high probability, our algorithm outputs an [Formula: see text]-approximation in constant time and space, independent of the size of the dataset. Here, an [Formula: see text]-approximation means that the separating margin of the classifier is almost optimal (with error [Formula: see text]) and the number of misclassified training points is very small (with error [Formula: see text]). Experimental results show that our algorithm outperforms most existing online algorithms, especially in space requirements, while maintaining high accuracy.
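To make the polytope distance view concrete, the following is a minimal illustrative sketch and not the algorithm of this paper. It assumes linearly separable data with labels +1 and -1, and it keeps only two points in memory: `p` in the convex hull of the positive examples and `q` in the convex hull of the negative examples. Each streamed example moves the point of its own class along the segment toward that example, using an exact line search, so the distance between `p` and `q` never increases. All function names here are hypothetical, and a single pass of this kind carries no approximation guarantee.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

def nearest_on_segment(start, end, target):
    """Point on the segment [start, end] that is closest to target."""
    d = [e - s for s, e in zip(start, end)]
    dd = dot(d, d)
    if dd == 0.0:
        return list(start)
    # Exact line search: minimize ||start + t*d - target||^2 over t in [0, 1].
    t = dot([g - s for s, g in zip(start, target)], d) / dd
    t = max(0.0, min(1.0, t))
    return [s + t * di for s, di in zip(start, d)]

def stream_closest_pair(stream):
    """One pass over (x, y) pairs; memory use is two points, whatever the stream length."""
    p = q = None  # p stays in conv(positives), q stays in conv(negatives)
    for x, y in stream:
        if y > 0:
            if p is None:
                p = list(x)
            elif q is not None:
                p = nearest_on_segment(p, x, q)
            # Assumption: further positives seen before any negative are skipped.
        else:
            if q is None:
                q = list(x)
            elif p is not None:
                q = nearest_on_segment(q, x, p)
    return p, q

def decision(p, q, x):
    """Signed score of x for the hyperplane that bisects the segment [q, p]."""
    w = [pi - qi for pi, qi in zip(p, q)]
    mid = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return dot(w, [xi - mi for xi, mi in zip(x, mid)])
```

The classifier is the hyperplane through the midpoint of `p` and `q` with normal `p - q`, so a positive value of `decision(p, q, x)` predicts the label +1. The length of `p - q` is an upper bound on the true polytope distance, which equals twice the optimal margin.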