A primal method for multiple kernel learning

Zhifeng Hao,Zijie Chen,Xiaowei Yang,Ganzhao Yuan

doi:10.1007/s00521-012-1022-2

Abstract

The canonical support vector machines (SVMs) are based on a single kernel, recent publications have shown that using multiple kernels instead of a single one can enhance interpretability of the decision function and promote classification accuracy. However, most of existing approaches mainly reformulate the multiple kernel learning as a saddle point optimization problem which concentrates on solving the dual. In this paper, we show that the multiple kernel learning (MKL) problem can be reformulated as a BiConvex optimization and can also be solved in the primal. While the saddle point method still lacks convergence results, our proposed method exhibits strong optimization convergence properties. To solve the MKL problem, a two-stage algorithm that optimizes canonical SVMs and kernel weights alternately is proposed. Since standard Newton and gradient methods are too time-consuming, we employ the truncated-Newton method to optimize the canonical SVMs. The Hessian matrix need not be stored explicitly, and the Newton direction can be computed using several Preconditioned Conjugate Gradient steps on the Hessian operator equation, the algorithm is shown more efficient than the current primal approaches in this MKL setting. Furthermore, we use the Nesterov’s optimal gradient method to optimize the kernel weights. One remarkable advantage of solving in the primal is that it achieves much faster convergence rate than solving in the dual and does not require a two-stage algorithm even for the single kernel LapSVM. Introducing the Laplacian regularizer, we also extend our primal method to semi-supervised scenario. Extensive experiments on some UCI benchmarks have shown that the proposed algorithm converges rapidly and achieves competitive accuracy.

Full Text