Abstract

Kernel-based models have strong generalization ability, but most of them, including SVM, are vulnerable to the curse of kernelization. Moreover, their predictive performance is sensitive to hyperparameter tuning, which demands substantial computational resources. These problems make kernel methods difficult to apply to large-scale datasets. To address this, we first formulate the optimization problem in a kernel-based learning setting as a posterior inference problem, and then develop a rich family of Recurrent Neural Network-based variational inference techniques. Unlike existing work, which stops at the variational distribution and uses it as a surrogate for the true posterior, we additionally leverage Stein Variational Gradient Descent to bring the variational distribution closer to the true posterior; we refer to this step as Stein Refinement. Putting these together, we arrive at a robust and efficient variational learning method for multiclass kernel machines with a highly accurate posterior approximation. Moreover, our formulation enables efficient learning of kernel parameters and hyperparameters, which robustifies the proposed method against data uncertainties. Extensive experiments show that, without tuning any parameter, our method obtains accuracy comparable to LIBSVM, a well-known implementation of SVM, on modest quantities of data and outperforms other baselines, while scaling seamlessly to large-scale datasets.
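For readers unfamiliar with the refinement step, the following is a minimal, generic sketch of one Stein Variational Gradient Descent (SVGD) update, illustrating how particles drawn from a variational distribution can be pushed toward the true posterior. It is not the paper's implementation: the kernel-machine posterior, the RNN-based variational family, and the gradient function `grad_log_p` are assumed placeholders, and the RBF kernel with median-heuristic bandwidth is one common choice.

```python
import numpy as np

def rbf_kernel(theta, h=None):
    """RBF kernel matrix over particles and its gradient w.r.t. the first argument."""
    sq_dists = np.sum((theta[:, None, :] - theta[None, :, :]) ** 2, axis=-1)
    if h is None:  # median heuristic for the bandwidth (a common default, assumed here)
        h = np.median(sq_dists) / np.log(theta.shape[0] + 1) + 1e-8
    K = np.exp(-sq_dists / h)
    # grad_K[i, j, :] = d K(theta_j, theta_i) / d theta_j
    grad_K = (2.0 / h) * (theta[:, None, :] - theta[None, :, :]) * K[:, :, None]
    return K, grad_K

def svgd_step(theta, grad_log_p, step_size=1e-2):
    """One SVGD update.

    theta: array of shape (n_particles, dim), e.g. samples from a variational distribution.
    grad_log_p: function returning the gradient of the log posterior for each particle.
    """
    n = theta.shape[0]
    K, grad_K = rbf_kernel(theta)
    # phi(theta_i) = (1/n) * sum_j [ K(theta_j, theta_i) * grad log p(theta_j)
    #                                + grad_{theta_j} K(theta_j, theta_i) ]
    phi = (K @ grad_log_p(theta) + grad_K.sum(axis=1)) / n
    return theta + step_size * phi
```

In this sketch, repeatedly applying `svgd_step` to particles sampled from the variational distribution moves them toward the target posterior, which is the role the abstract attributes to Stein Refinement.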
