Abstract

A linear SVM scales linearly with the size of a dataset, and hence is very desirable as a classifier for large datasets. However, it is not able to classify a dataset having a nonlinear decision boundary between the classes unless the dataset has been transformed by some mapping function so that the decision boundary becomes linear or it is a good approximation to a linear boundary. Often these mapping functions may result in a dataset with very large dimension or even infinite dimension. To avoid the curse of dimensionality, kernel functions are used as mapping functions. However, a kernel SVM has quadratic time complexity, and hence does not scale very well with large datasets. Moreover, the choice of a kernel function and its parameter optimization are arduous tasks. Therefore, a replacement of kernel function with an explicit mapping function is desirable in the case of large datasets. In this paper, we propose a novel co-evolutionary approach to find an explicit mapping function. We use GA to evolve an n-tuple of GP trees as a mapping function, and GP to evolve each individual GP tree. The dataset is then transformed using the found mapping function so that a linear SVM can be used. Besides the fact that the proposed algorithm allows us to use a fast linear SVM, the results also show that the proposed algorithm outperforms the kernel trick and even performs as good as the kernel trick combined with feature selection.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call