Abstract

The kernel trick is widely applied to Support Vector Machines (SVMs) to handle linearly inseparable data, yielding what is known as kernel SVM. However, kernel SVM incurs a high computational cost in practice, which makes it unsuitable for large-scale data. Moreover, kernel SVM introduces hyper-parameters, e.g., the bandwidth of the Gaussian kernel. These hyper-parameters have a significant influence on the final performance of kernel SVM and are hard to tune, especially on large-scale data; considerable effort may be needed to find good enough settings, and improper settings often make the classification performance even lower than that of linear SVM. Inspired by recent progress on linear SVM for large-scale data, we propose a well-designed classifier, Decision Tree SVM (DTSVM), to efficiently handle large-scale linearly inseparable data. DTSVM has a much lower computational cost than kernel SVM, and it introduces almost no hyper-parameters except a few thresholds, which can be fixed in practice. Comprehensive experiments on large-scale datasets demonstrate the superiority of the proposed method.
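To make the bandwidth sensitivity mentioned above concrete, the following minimal sketch (not from the paper; the function name is illustrative) evaluates the Gaussian kernel k(x, y) = exp(-||x - y||² / (2σ²)) at several bandwidths σ. A small σ drives the similarity between even nearby points toward zero, while a large σ makes all points look nearly identical, which is why an improper bandwidth can degrade kernel SVM below linear SVM.

```python
import math


def gaussian_kernel(x, y, sigma):
    """Gaussian (RBF) kernel: k(x, y) = exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


x, y = (0.0, 0.0), (1.0, 1.0)  # squared Euclidean distance is 2
for sigma in (0.1, 1.0, 10.0):
    # sigma=0.1 -> k ~ 0 (points look unrelated);
    # sigma=10  -> k ~ 1 (points look identical)
    print(f"sigma={sigma}: k = {gaussian_kernel(x, y, sigma):.4f}")
```

Kernel SVM must evaluate such kernel values over (up to) all pairs of training points, which is one source of the quadratic cost that the abstract contrasts with linear SVM.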
