Abstract

In this paper, we give a new generalization error bound for Multiple Kernel Learning (MKL) under a general class of regularizations, and discuss which kind of regularization yields favorable predictive accuracy. Our main target is dense-type regularizations, including $\ell_{p}$-MKL. Numerical experiments have shown that sparse regularization does not necessarily perform better than dense-type regularizations. Motivated by this fact, this paper gives a general theoretical tool for deriving fast learning rates of MKL that applies to arbitrary mixed-norm-type regularizations in a unifying manner. This enables us to compare the generalization performance of various types of regularization. As a consequence, we observe that the homogeneity of the complexities of the candidate reproducing kernel Hilbert spaces (RKHSs) affects which regularization strategy ($\ell_{1}$ or dense) is preferred. In homogeneous complexity settings, where all RKHSs have the same complexity, $\ell_{1}$-regularization is optimal among all isotropic norms. On the other hand, in inhomogeneous complexity settings, dense-type regularizations can achieve a better learning rate than sparse $\ell_{1}$-regularization. We also show that our learning rate achieves the minimax lower bound in homogeneous complexity settings.
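
To fix notation, a ψ-norm (mixed-norm-type) MKL estimator of the kind studied here can be sketched as follows; this display is an illustrative reconstruction from standard MKL formulations, and the exact loss and regularizer used in the paper may differ:

$$
\hat{f} = \operatorname*{arg\,min}_{f_m \in \mathcal{H}_m}\; \frac{1}{n}\sum_{i=1}^{n}\Big(y_i - \sum_{m=1}^{M} f_m(x_i)\Big)^2 + \lambda\, \psi\big(\|f_1\|_{\mathcal{H}_1}, \ldots, \|f_M\|_{\mathcal{H}_M}\big),
$$

where $\mathcal{H}_1,\dots,\mathcal{H}_M$ are the candidate RKHSs and $\psi$ is a norm on $\mathbb{R}^M$. Taking $\psi(\cdot)=\|\cdot\|_{p}$ recovers $\ell_{p}$-MKL: $p=1$ gives the sparse regularization, while $p>1$ gives dense-type regularization.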

Highlights

  • Multiple Kernel Learning (MKL), proposed by Lanckriet et al. (2004), is one of the most promising methods for adaptively selecting the kernel function in supervised kernel learning

  • We have presented a unifying framework for deriving the learning rate of MKL with an arbitrary mixed-norm-type regularization

  • We have shown that the convergence rate of $\ell_{p}$-MKL obtained in homogeneous settings is tighter and requires less restrictive conditions than existing results


Summary

Introduction

Multiple Kernel Learning (MKL), proposed by Lanckriet et al. (2004), is one of the most promising methods for adaptively selecting the kernel function in supervised kernel learning. The goal of this paper is to give a theoretical justification for the experimental results that favor dense-type MKL methods. To this aim, we give a unifying framework for deriving a fast learning rate for an arbitrary norm-type regularization, and discuss which regularization is preferred depending on the problem setting. Cortes et al. (2009b) presented a convergence bound for a learning method with $\ell_{2}$ regularization on the kernel weights. Koltchinskii and Yuan (2010) considered a variant of $\ell_{1}$-MKL and showed that it achieves the minimax optimal convergence rate. All these localized convergence rates were derived in sparse learning settings, and it has not been discussed how a dense-type regularization can outperform the sparse $\ell_{1}$-regularization. As far as the author knows, this is the first theoretical attempt to clearly show that inhomogeneous complexities are advantageous for dense-type MKL.
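
As a concrete illustration of the dense-type methods discussed above, the following sketch trains a minimal $\ell_{p}$-MKL with the squared loss (kernel ridge regression) by alternating between the ridge solution and the closed-form kernel-weight update commonly used in $\ell_{p}$-MKL solvers. The function name, parameters, and update rule are illustrative assumptions and do not reproduce the estimator analyzed in this paper.

```python
import numpy as np

def lp_mkl_krr(kernels, y, p=2.0, lam=1e-2, n_iter=20):
    """Illustrative lp-MKL with squared loss (kernel ridge regression).

    kernels : list of (n, n) precomputed Gram matrices, one per candidate RKHS
    y       : (n,) target vector
    p       : exponent of the lp-norm constraint on the kernel weights (p > 1 is "dense")
    lam     : ridge regularization parameter
    """
    M, n = len(kernels), len(y)
    d = np.full(M, M ** (-1.0 / p))                  # uniform weights with ||d||_p = 1
    alpha = np.zeros(n)
    for _ in range(n_iter):
        K = sum(dm * Km for dm, Km in zip(d, kernels))      # combined kernel
        alpha = np.linalg.solve(K + lam * np.eye(n), y)     # ridge dual coefficients
        # squared RKHS norm of each component f_m = d_m * K_m @ alpha
        norms2 = np.array([dm ** 2 * alpha @ Km @ alpha for dm, Km in zip(d, kernels)])
        # closed-form update: d_m proportional to ||f_m||^{2/(p+1)}, renormalized to ||d||_p = 1;
        # p close to 1 concentrates the weights (sparse), larger p keeps all kernels active (dense)
        d = (norms2 + 1e-12) ** (1.0 / (p + 1))
        d /= np.linalg.norm(d, ord=p)
    return d, alpha
```

With $p$ close to 1 the learned weights concentrate on a few kernels, while larger $p$ keeps all candidate kernels active, which is the sparse/dense contrast discussed above.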

Problem Formulation
Notations and Assumptions
Convergence Rate of ψ-norm MKL
Analysis on Homogeneous Settings
Simplification of Convergence Rate
The dual of the norm
Convergence Rate of $\ell_p$-MKL
Minimax Lower Bound
Optimal Regularization Strategy
Analysis on Inhomogeneous Settings
Numerical Comparison between Homogeneous and Inhomogeneous Settings
Generalization of loss function
Conclusion and Future Work
Proof of Lemma 5