Abstract

Local learning algorithms use a neighborhood of training data close to a given testing query point to learn local parameters and build, on the fly, a local model specifically designed for that query point. The local approach delivers breakthrough performance in many application domains. This paper considers local learning versions of regularization networks (RN) and investigates several options for improving their online prediction performance, in both accuracy and speed. First, we exploit the interplay between locally optimized and globally optimized hyper-parameters (the regularization parameter and the kernel width) that each new predictor needs to optimize online. Operating cost is substantially reduced when the two globally optimized hyper-parameters are shared by all local models. We also demonstrate that this global optimization of the two hyper-parameters produces more accurate models than the alternatives that locally optimize online either the regularization parameter, the kernel width, or both. Next, by comparing eigenvalue decomposition (EVD) with Cholesky decomposition specifically for the local learning training and testing phases, we reveal that the Cholesky-based implementations are faster than their EVD counterparts in all training cases. While EVD is well suited to validating several regularization parameters cost-effectively, Cholesky should be preferred when validating several neighborhood sizes (the number of k nearest neighbors) and when the local network operates online. We then exploit parallelism in a multi-core system for these local computations, demonstrating that execution times are reduced further. Finally, although using pre-computed, stored local models instead of online-learned local models is faster still, this option degrades accuracy. Evidently, there is a substantial gain in waiting for a testing point to arrive before building a local model; hence, the online local learning RNs are more accurate than their pre-computed, stored counterparts. To support these findings, we present extensive experimental results and comparisons on several benchmark datasets.
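
To make the pipeline the abstract describes concrete, the sketch below builds a local RN around a query point: it selects the k nearest training points, solves the regularized local system (K + λI)c = y with a Cholesky factorization (the option the paper reports as fastest for online operation), and also shows why EVD is attractive when validating many regularization parameters, since one O(k³) eigendecomposition is reused at O(k²) cost per λ. This is a minimal NumPy sketch under our own assumptions (Gaussian kernel, Euclidean k-NN distance, plain numpy.linalg calls); function names such as local_rn_predict are hypothetical illustrations, not taken from the paper.

```python
import numpy as np

def rbf_kernel(A, B, width):
    """Gaussian (RBF) kernel matrix between row-vector sets A (n,d) and B (m,d)."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * width**2))

def local_rn_predict(x_query, X_train, y_train, k, lam, width):
    """Build a local regularization network around x_query and predict its output.

    The k nearest training points form the neighborhood; the local model solves
    (K + lam*I) c = y via Cholesky, the decomposition the paper finds faster
    than EVD for online, single-solve operation.
    """
    # k-nearest neighbors of the query point (Euclidean distance)
    dists = np.linalg.norm(X_train - x_query, axis=1)
    idx = np.argsort(dists)[:k]
    Xk, yk = X_train[idx], y_train[idx]

    # Local kernel matrix and regularized linear system
    K = rbf_kernel(Xk, Xk, width)
    L = np.linalg.cholesky(K + lam * np.eye(k))          # K + lam*I = L @ L.T
    c = np.linalg.solve(L.T, np.linalg.solve(L, yk))     # two triangular solves

    # Evaluate the local kernel expansion at the query point
    k_q = rbf_kernel(x_query[None, :], Xk, width)
    return float(k_q @ c)

def coefficients_over_lambdas_evd(Xk, yk, width, lambdas):
    """Reuse one EVD across many regularization parameters:
    (K + lam*I)^{-1} y = U diag(1/(e + lam)) U^T y.
    One O(k^3) factorization, then O(k^2) per lambda candidate."""
    K = rbf_kernel(Xk, Xk, width)
    e, U = np.linalg.eigh(K)
    Uty = U.T @ yk
    return [U @ (Uty / (e + lam)) for lam in lambdas]

# Toy usage on synthetic data (illustrative only)
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=500)
print(local_rn_predict(X[0], X[1:], y[1:], k=50, lam=1e-2, width=1.0))
```

The contrast between the two helpers mirrors the abstract's finding: Cholesky must refactorize for every new λ but is cheaper for the single solve an online query needs, whereas EVD amortizes its higher one-off cost when many regularization parameters are validated on the same neighborhood.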
