Abstract

In recent years, variance reduction techniques have greatly improved the performance of stochastic gradient algorithms, providing a new framework for finding first-order critical points of nonconvex problems. This work considers, for the first time, online kernel learning in the nonconvex setting, and we propose a variance-reduced stochastic optimization algorithm for learning functions in a reproducing kernel Hilbert space (RKHS). In contrast to existing techniques, we use a convex combination of an existing biased estimator and an unbiased stochastic estimator, which eliminates the need for the double-loop structure typically required for variance reduction. Non-parametric function approximators are well known to provide a structured way to learn nonlinear statistical models. However, their representational complexity grows with the sample size, and in streaming settings it may grow without bound. It is therefore crucial to balance statistical accuracy against finite representational complexity. The proposed non-parametric variance-reduced learning algorithm, FHSGD, keeps memory growth in check using a new compression scheme based on orthogonal matching pursuit, called HKOMP. We present theoretical guarantees for a special case of the FHSGD algorithm, which reduces to the vanilla functional stochastic gradient descent (FSGD) estimator. For the first time, the tradeoff between accuracy and model complexity in non-parametric learning under the nonconvex setting is established for the standard FSGD algorithm, characterized in terms of the number of iterations <tex xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">$T$</tex> and the model complexity parameter <tex xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">$\nu$</tex>.
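To make the single-loop idea concrete, the sketch below illustrates a recursive variance-reduced estimator of the same flavor on a scalar nonconvex toy problem. This is a hypothetical illustration, not the paper's FHSGD update: the actual algorithm operates on functions in an RKHS with kernel expansions and HKOMP compression, while here we use a plain Euclidean parameter, an assumed toy objective f(x) = x^4 - 3x^2, and made-up step sizes. The key feature shown is the convex combination of an unbiased stochastic gradient with a biased recursive correction, which removes the double loop (no full-gradient checkpoint is ever computed).

```python
import numpy as np

def storm_style_sgd(x0, lr=0.01, a=0.1, std=0.1, steps=2000, seed=0):
    """Single-loop variance-reduced SGD on the toy objective
    f(x) = x^4 - 3x^2, whose gradient is 4x^3 - 6x.
    All parameter values here are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    x = x0
    # Initialize the estimate with one plain stochastic gradient.
    d = 4 * x**3 - 6 * x + rng.normal(0.0, std)
    for _ in range(steps):
        x_new = x - lr * d          # descend along the variance-reduced estimate
        # One noise sample evaluated at BOTH iterates, so the difference
        # g_new - g_old is nearly noise-free -- this is the variance reduction.
        e = rng.normal(0.0, std)
        g_new = 4 * x_new**3 - 6 * x_new + e   # unbiased gradient at x_new
        g_old = 4 * x**3 - 6 * x + e           # same sample at the old iterate
        # Convex combination: a * (unbiased estimate)
        #                   + (1 - a) * (biased recursive estimate).
        d = a * g_new + (1 - a) * (d + g_new - g_old)
        x = x_new
    return x

x_final = storm_style_sgd(2.0)
```

With small noise, the iterate settles near a stationary point of the toy objective (the nonzero critical points are at x = ±sqrt(3/2)); setting the mixing weight a = 1 recovers plain SGD, while a = 0 gives the purely recursive (biased) estimator, so a interpolates between the two estimators exactly as the abstract describes.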
We substantiate the efficiency of the proposed algorithm through numerical experiments estimating the ocean salinity field at different locations in the Gulf of Mexico [1], where the algorithm plays a key role in interpolating missing values across locations.
