Abstract
In this paper, we study kernel-based online gradient descent with the least squares loss and without an explicit regularization term. The novelty of our approach lies in controlling the expectation of the K-norm of $f_t$ through an iterative process. We then use distributed learning to improve our result.
Highlights
Unlike classical batch learning, which learns from the entire data set, online learning learns from a data set of increasing size
The online gradient descent algorithm is defined iteratively, starting from $f_1 = 0$
In distributed learning, we divide the data into $J$ different subsets and run the online gradient descent algorithm on each subset
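The divide-and-run scheme described in the highlights can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the Gaussian kernel, the polynomially decaying step size `eta1 * t**(-theta)`, the contiguous split into `J` subsets, and the equal-weight averaging of the `J` local outputs are all choices made here for illustration. The local update used is the usual unregularized least squares form, $f_{t+1} = f_t - \eta_t (f_t(x_t) - y_t) K_{x_t}$ with $f_1 = 0$, which is likewise assumed rather than quoted from the text.

```python
import numpy as np


def gaussian_kernel(a, b, sigma=0.5):
    """Gaussian kernel matrix between the rows of a and the rows of b (assumed kernel)."""
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))


def online_gd(X, y, eta1=0.5, theta=0.5, kernel=gaussian_kernel):
    """One pass of unregularized kernel online gradient descent, starting from f_1 = 0.

    Returns coefficients c such that the final iterate is f = sum_i c[i] * K(X[i], .).
    """
    n = len(y)
    c = np.zeros(n)
    K = kernel(X, X)
    for t in range(n):
        pred = K[t, :t] @ c[:t]            # f_t(x_t); the empty sum at t = 0 gives 0
        eta = eta1 * (t + 1) ** (-theta)   # assumed step size schedule
        c[t] = -eta * (pred - y[t])        # adds -eta_t (f_t(x_t) - y_t) K_{x_t}
    return c


def predict(X_train, c, X_new, kernel=gaussian_kernel):
    """Evaluate f = sum_i c[i] * K(X_train[i], .) at the rows of X_new."""
    return kernel(X_new, X_train) @ c


def distributed_online_gd(X, y, J, eta1=0.5, theta=0.5, kernel=gaussian_kernel):
    """Split the data into J subsets, run online_gd on each, and average the outputs."""
    parts = np.array_split(np.arange(len(y)), J)
    models = [(X[idx], online_gd(X[idx], y[idx], eta1, theta, kernel)) for idx in parts]

    def f_bar(X_new):
        # Equal-weight average of the J local estimators (an assumption of this sketch).
        return np.mean([predict(Xj, cj, X_new, kernel) for Xj, cj in models], axis=0)

    return f_bar


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.uniform(-1.0, 1.0, size=(600, 1))
    y = np.sin(np.pi * X[:, 0]) + 0.1 * rng.standard_normal(600)
    f_bar = distributed_online_gd(X, y, J=4)
    print(f_bar(np.array([[0.5]])))
```

Each local run touches only its own subset, so the `J` calls to `online_gd` are independent and could be executed in parallel; with `J=1` the sketch reduces to plain online gradient descent on the full sample.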
Summary
Unlike classical batch learning, which learns from the entire data set, online learning learns from a data set of increasing size. Gradient descent is a powerful method for finding a minimizer of a function, and online gradient descent adapts it to the online setting. The online gradient descent algorithm has been studied recently in [9, 15]. In [14], the early stopping approach for batch learning is studied. In [9], the author studied an online gradient descent algorithm with a regularization term $\lambda f_t$, whose iteration starts from $f_1 = 0$. We call $\lambda$ the regularization parameter. When $\lambda > 0$, the algorithm is called online regularized learning, and it has been well studied in [7, 9, 16].
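For orientation, a regularized online gradient descent step for the least squares loss is commonly written in the form below. This is a sketch of the standard form, not a quotation from the paper: the step size $\eta_t$, the sample $(x_t, y_t)$, and the kernel section $K_{x_t} = K(x_t, \cdot)$ are notation assumed here.

\[
f_1 = 0, \qquad f_{t+1} = f_t - \eta_t \left( \left( f_t(x_t) - y_t \right) K_{x_t} + \lambda f_t \right), \quad t = 1, 2, \ldots
\]

Under this form, setting $\lambda = 0$ removes the explicit regularization term and leaves $f_{t+1} = f_t - \eta_t \left( f_t(x_t) - y_t \right) K_{x_t}$, which corresponds to the unregularized setting this paper studies.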