L2-norm multiple kernel learning and its application to biomedical data fusion

Shi Yu, Johan AK Suykens, Yves Moreau, Tillmann Falck, Leon-Charles Tranchevent, Anneleen Daemen, Bart De Moor

doi:10.1186/1471-2105-11-309

Abstract

Background: This paper introduces the notion of optimizing different norms in the dual problem of support vector machines with multiple kernels. The selection of norms yields different extensions of multiple kernel learning (MKL) such as L∞, L1, and L2 MKL. In particular, L2 MKL is a novel method that leads to non-sparse optimal kernel coefficients, in contrast to the sparse kernel coefficients optimized by the existing L∞ MKL method. In real biomedical applications, L2 MKL may have advantages over sparse integration methods for thoroughly combining complementary information in heterogeneous data sources.

Results: We provide a theoretical analysis of the relationship between the L2 optimization of kernels in the dual problem and the L2 coefficient regularization in the primal problem. Understanding the dual L2 problem grants a unified view of MKL and enables us to extend the L2 method to a wide range of machine learning problems. We implement L2 MKL for ranking and classification problems and compare its performance with the sparse L∞ and the averaging L1 MKL methods. The experiments are carried out on six real biomedical data sets and two large-scale UCI data sets. L2 MKL yields better performance on most of the benchmark data sets. In particular, we propose a novel L2 MKL least squares support vector machine (LSSVM) algorithm, which is shown to be an efficient and promising classifier for processing large-scale data sets.

Conclusions: This paper extends the statistical framework of genomic data fusion based on MKL. Allowing non-sparse weights on the data sources is an attractive option in settings where we believe most data sources to be relevant to the problem at hand and want to avoid the "winner-takes-all" effect seen in L∞ MKL, which can be detrimental to the performance in prospective studies. The notion of optimizing L2 kernels can be straightforwardly extended to ranking, classification, regression, and clustering algorithms. To tackle the computational burden of MKL, this paper proposes several novel LSSVM-based MKL algorithms. Systematic comparison on real data sets shows that LSSVM MKL has performance comparable to that of the conventional SVM MKL algorithms. Moreover, large-scale numerical experiments indicate that when cast as semi-infinite programming, LSSVM MKL can be solved more efficiently than SVM MKL.

Availability: The MATLAB code of the algorithms implemented in this paper can be downloaded from http://homes.esat.kuleuven.be/~sistawww/bioi/syu/l2lssvm.html.

Highlights

This paper introduces the notion of optimizing different norms in the dual problem of support vector machines with multiple kernels.
Although text data performs well in the prioritization of known disease genes, it does not always work best for newly discovered genes. This experiment demonstrates that when prioritizing novel prostate cancer relevant genes, the L2 multiple kernel learning (MKL) approach evenly optimized the kernel coefficients to combine heterogeneous genomic sources, and its performance was significantly better than that of the L∞ method.
In this paper we propose a new L2 MKL framework as the complement to the existing L∞ MKL method proposed by Lanckriet et al. The L2 MKL is characterized by the non-sparse integration of multiple kernels to optimize the objective function of machine learning problems.

Summary

Introduction

This paper introduces the notion of optimizing different norms in the dual problem of support vector machines with multiple kernels. Multiple kernel learning (MKL) was pioneered by Lanckriet et al. [4] and Bach et al. [5] as an additive extension of the single kernel SVM to incorporate multiple kernels in classification. It has been applied as a statistical learning framework for genomic data fusion [6] and in many other applications [7]. The existing L∞ MKL method yields sparse kernel coefficients, and we may expect the performance of such sparse solutions to degrade significantly in real-world applications. To address this problem, we propose a new kernel fusion scheme that optimizes the L2-norm of multiple kernels. Empirical results show that L2-norm kernel fusion can lead to better performance in biomedical data fusion.
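The contrast between sparse and non-sparse kernel coefficients can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes the dual variables `alpha` are held fixed, computes one quadratic term `s[j] = alpha @ K_j @ alpha` per kernel, and uses the standard dual-norm (Hölder) relationship to obtain the coefficients attached to each norm of `s`. The function name `kernel_coefficients`, the toy linear kernels, and the random `alpha` are all hypothetical and chosen only for illustration.

```python
import numpy as np


def kernel_coefficients(s, mkl_norm):
    """Illustrative kernel weights for a fixed vector s of per-kernel terms.

    s[j] is assumed non-negative (alpha' K_j alpha with K_j positive
    semidefinite). The weights theta >= 0 attain theta @ s == ||s|| for the
    chosen norm; this is a sketch, not the paper's solver.
    """
    s = np.asarray(s, dtype=float)
    if mkl_norm == "Linf":
        # max_j s[j]: all weight goes to the largest term (sparse,
        # "winner-takes-all").
        theta = np.zeros_like(s)
        theta[np.argmax(s)] = 1.0
    elif mkl_norm == "L2":
        # ||s||_2: weights proportional to s (non-sparse).
        norm = np.linalg.norm(s)
        if norm == 0.0:
            raise ValueError("s must contain at least one non-zero term")
        theta = s / norm
    elif mkl_norm == "L1":
        # sum_j s[j]: every kernel gets the same weight (averaging, up to scale).
        theta = np.ones_like(s)
    else:
        raise ValueError(f"unknown norm: {mkl_norm}")
    return theta


# Toy data (assumed): three linear kernels built from random features.
rng = np.random.default_rng(0)
n_samples = 20
kernels = []
for n_features in (2, 5, 10):
    X = rng.normal(size=(n_samples, n_features))
    kernels.append(X @ X.T)  # positive semidefinite by construction

alpha = rng.uniform(size=n_samples)  # stand-in for fixed dual variables
s = np.array([alpha @ K @ alpha for K in kernels])

for name in ("Linf", "L2", "L1"):
    print(name, np.round(kernel_coefficients(s, name), 3))
```

In this sketch the L∞ case always returns a single non-zero weight, while the L2 case keeps every kernel with a non-zero term in play, which mirrors the sparse versus non-sparse behaviour described above. A full MKL solver would alternate between updating `alpha` and the kernel weights rather than fixing `alpha` once.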

Methods

Results

Discussion

Conclusion