Abstract

Sparse, knot-based Gaussian processes have enjoyed considerable success as scalable approximations of full Gaussian processes. Certain sparse models can be derived through specific variational approximations to the true posterior, and knots can be selected to minimize the Kullback-Leibler divergence between the approximate and true posterior. While this has been a successful approach, simultaneous optimization of knots can be slow due to the number of parameters being optimized. Furthermore, few methods have been proposed for selecting the number of knots, and no experimental results exist in the literature. We propose a one-at-a-time knot selection algorithm based on Bayesian optimization to select both the number and the locations of knots. On three benchmark datasets, we show that this method performs competitively with simultaneous optimization of knots at a fraction of the computational cost.

Highlights

  • Gaussian processes (GPs) are a class of Bayesian nonparametric models with a plethora of uses, such as nonparametric regression and classification, spatial and time series modeling, density estimation, and numerical optimization and integration

  • We extend the use of the novel OAT knot selection algorithm in ref. [8] to the context of nonparametric regression and variational inference

  • We provide experimental results on three real datasets showing the competitive accuracy of models selected using the OAT algorithm compared to those chosen via simultaneous optimization, but often at a lower computational cost

Summary

INTRODUCTION

Gaussian processes (GPs) are a class of Bayesian nonparametric models with a plethora of uses, such as nonparametric regression and classification, spatial and time series modeling, density estimation, and numerical optimization and integration. In sparse GP approximations, knot locations are typically chosen by optimizing an objective function. The two most common objective functions are the marginal likelihood (or an approximation of it) [21,15,4,12] and the evidence lower bound (ELBO) when a variational inference approach is taken [22,4,11]. While this is often successful in practice, it requires the user to choose the number of knots, K, up front. Reference [8] proposed an efficient one-at-a-time (OAT) knot selection algorithm based on Bayesian optimization to select the number and locations of knots in sparse GPs when the objective function is the marginal likelihood.
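To make the OAT idea concrete, the following is a minimal sketch of greedy knot selection scored by the collapsed variational lower bound for sparse GP regression. Everything in it is illustrative rather than the paper's implementation: the helper names (`rbf_kernel`, `elbo`, `oat_knot_selection`), the fixed kernel hyperparameters and noise level, the `tol` stopping threshold, and, in particular, the candidate search, which here samples random training inputs instead of running the Bayesian-optimization search of ref. [8].

```python
# Hypothetical sketch of one-at-a-time (OAT) knot selection for sparse GP
# regression, scored by the collapsed ELBO. Assumptions (not from the paper):
# fixed RBF hyperparameters, fixed noise, and random candidate proposals in
# place of the Bayesian-optimization search.
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
    # Squared-exponential kernel between row-wise inputs A (n,d) and B (m,d).
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / lengthscale ** 2)

def elbo(X, y, Z, noise=0.1, jitter=1e-6):
    # Collapsed variational lower bound for sparse GP regression:
    #   log N(y | 0, Q_nn + noise*I) - trace(K_nn - Q_nn) / (2*noise),
    # where Q_nn = K_nm K_mm^{-1} K_mn and Z holds the knot locations.
    n = X.shape[0]
    Kmm = rbf_kernel(Z, Z) + jitter * np.eye(Z.shape[0])
    Knm = rbf_kernel(X, Z)
    L = np.linalg.cholesky(Kmm)
    A = np.linalg.solve(L, Knm.T)            # Q_nn = A.T @ A
    Qnn = A.T @ A
    Lc = np.linalg.cholesky(Qnn + noise * np.eye(n))
    alpha = np.linalg.solve(Lc, y)
    log_marg = (-0.5 * alpha @ alpha
                - np.log(np.diag(Lc)).sum()
                - 0.5 * n * np.log(2 * np.pi))
    trace_term = (np.trace(rbf_kernel(X, X)) - np.trace(Qnn)) / (2 * noise)
    return log_marg - trace_term

def oat_knot_selection(X, y, max_knots=30, n_candidates=50, tol=1e-3, seed=0):
    # Greedy OAT loop: add one knot at a time, keeping a new knot only if
    # it improves the ELBO by more than `tol`; otherwise stop. This selects
    # both the number and the locations of the knots in a single pass.
    rng = np.random.default_rng(seed)
    Z = X[rng.choice(len(X), size=1)]        # start from one random knot
    best = elbo(X, y, Z)
    while Z.shape[0] < max_knots:
        cand = rng.choice(len(X), size=min(n_candidates, len(X)), replace=False)
        scores = [elbo(X, y, np.vstack([Z, X[i:i + 1]])) for i in cand]
        j = int(np.argmax(scores))
        if scores[j] - best <= tol:          # no worthwhile improvement
            break
        Z = np.vstack([Z, X[cand[j]:cand[j] + 1]])
        best = scores[j]
    return Z, best

# Example usage on synthetic 1-D data.
if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.uniform(-3, 3, size=(200, 1))
    y = np.sin(2 * X[:, 0]) + 0.1 * rng.standard_normal(200)
    Z, bound = oat_knot_selection(X, y)
    print(f"selected {Z.shape[0]} knots, ELBO = {bound:.2f}")
```

Because each step searches over a single new knot while earlier knots stay fixed, the per-iteration parameter space stays small; this is the source of the computational savings over simultaneous optimization noted in the abstract.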

GP REGRESSION
Deterministic inducing conditional
Deterministic training conditional
Fully independent conditional
Fully independent training conditional
VARIATIONAL INFERENCE
Variational inference in sparse GPs
Knot selection using the ELBO
EXPERIMENTS
Boston housing data
Airfoil data
Combined cycle power plant data
Findings
DISCUSSION
