Abstract
We consider a general regularised interpolation problem for learning a parameter vector from data. The well-known representer theorem says that under certain conditions on the regulariser there exists a solution in the linear span of the data points. This is at the core of kernel methods in machine learning, as it makes the problem computationally tractable. Most of the literature deals only with sufficient conditions for representer theorems in Hilbert spaces and shows that the regulariser being norm-based is sufficient for the existence of a representer theorem. We prove necessary and sufficient conditions for the existence of representer theorems in reflexive Banach spaces and show that any regulariser has to be essentially norm-based for a representer theorem to exist. Moreover, we illustrate why, in a sense, reflexivity is the minimal requirement on the function space. We further show that if the learning relies on the linear representer theorem, then the solution is independent of the regulariser and is in fact determined by the function space alone. This in particular shows the value of generalising Hilbert space learning theory to Banach spaces.
Highlights
It is a common approach in learning theory to formulate a problem of estimating functions from input and output data as an optimisation problem.
One contribution of this work is to clarify this view: we show that if the learning relies on the linear representer theorem, the solution is independent of the regulariser, and it is the function space we choose to work in that determines the solution.
This section is based on the work of [21, 22] on reproducing kernel Banach spaces, so we first present the relevant definitions and results on the construction of RKBS from [22].
Summary
A widely used approach is regularisation, in particular Tikhonov regularisation, where we consider an optimisation problem of the form

$$\min \big\{\, E\big( (\langle f, x_i \rangle_H, y_i)_{i=1}^m \big) + \lambda\, \Omega(f) \;:\; f \in H \,\big\},$$

where $H$ is a Hilbert space with inner product $\langle \cdot, \cdot \rangle_H$, $\{(x_i, y_i) : i \in \mathbb{N}_m\} \subset H \times Y$ is a set of given input/output data with $Y \subseteq \mathbb{R}$, $E : \mathbb{R}^m \times Y^m \to \mathbb{R}$ is an error function, $\Omega : H \to \mathbb{R}$ a regulariser, and $\lambda > 0$ is a regularisation parameter. Crucial for the success of regularisation methods in Hilbert spaces is the well-known representer theorem, which states that for certain regularisers there is always a solution in the linear span of the data points [6, 9, 16, 18]. This means that the problem reduces to finding a function in a finite-dimensional subspace of the original function space, which is often infinite-dimensional. An important consequence of our characterisation of regularisers which admit a linear representer theorem is that one can prove that the solution does not depend on the regulariser but only on the space in which the optimisation problem is stated. We prove the other main result of the paper, which states that, if we rely on the linear representer theorem, the solution is independent of the regulariser and is determined by the function space alone.
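To make the computational payoff concrete, the following worked special case (our illustration, not taken from the paper) uses the squared-error loss and the norm-based regulariser $\Omega(f) = \|f\|_H^2$, i.e. kernel ridge regression. By the linear representer theorem a minimiser can be sought in the span of the data, $f = \sum_{j=1}^m c_j x_j$ with $c \in \mathbb{R}^m$. Writing $G_{ij} = \langle x_i, x_j \rangle_H$ for the Gram matrix, we have $\langle f, x_i \rangle_H = (Gc)_i$ and $\|f\|_H^2 = c^\top G c$, so the infinite-dimensional problem reduces to

$$\min_{c \in \mathbb{R}^m} \; \|Gc - y\|_2^2 + \lambda\, c^\top G c,$$

a finite-dimensional problem whose first-order condition $G\big((G + \lambda I)c - y\big) = 0$ is satisfied by any $c$ solving the $m \times m$ linear system $(G + \lambda I)c = y$.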