Abstract
Gaussian process regression (GPR) is a non-parametric Bayesian technique for interpolating or fitting data. The main barrier to wider uptake of this powerful tool lies in the computational cost of the matrix operations that arise when dealing with large datasets. Here, we derive some simple results which we have found useful for speeding up the learning stage of the GPR algorithm, and especially for performing Bayesian model comparison between different covariance functions. We apply our techniques to both synthetic and real data and quantify the speed-up relative to using nested sampling to numerically evaluate model evidences.
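To make the bottleneck concrete, the following is a minimal sketch (not the paper's implementation) of evaluating the GP log marginal likelihood that is maximized during the learning stage. The squared-exponential kernel, the hyperparameter values, and the function names are illustrative assumptions; the Cholesky factorization of the n x n covariance matrix costs O(n^3), which is the cost that motivates the paper's techniques.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def sq_exp_kernel(x, amplitude, length_scale):
    """Squared-exponential covariance matrix for 1-D inputs x (illustrative choice)."""
    d = x[:, None] - x[None, :]
    return amplitude**2 * np.exp(-0.5 * (d / length_scale) ** 2)

def log_marginal_likelihood(x, y, amplitude, length_scale, noise):
    """log p(y | x, hyperparameters) for a zero-mean GP."""
    K = sq_exp_kernel(x, amplitude, length_scale) + noise**2 * np.eye(len(x))
    L, lower = cho_factor(K, lower=True)            # O(n^3) bottleneck
    alpha = cho_solve((L, lower), y)                # K^{-1} y
    log_det = 2.0 * np.sum(np.log(np.diag(L)))      # log |K|
    return -0.5 * y @ alpha - 0.5 * log_det - 0.5 * len(x) * np.log(2 * np.pi)

# Toy usage on synthetic data
x = np.linspace(0.0, 10.0, 50)
y = np.sin(x) + 0.1 * np.random.default_rng(0).standard_normal(50)
print(log_marginal_likelihood(x, y, amplitude=1.0, length_scale=1.0, noise=0.1))
```

Training a GP means repeating this evaluation inside an optimizer or sampler, so every saved evaluation, and every reduction in the dimensionality of the hyperparameter space, translates directly into faster learning.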
Highlights
A wide range of commonly occurring inference problems can be fruitfully tackled using Bayesian methods
A common inference problem is that of regression: determining the relationship between a control variable x and an output variable y given a set of measurements {y_i} taken at points {x_i}
We present two techniques that speed up the training stage of the Gaussian process regression (GPR) algorithm
Summary
A wide range of commonly occurring inference problems can be fruitfully tackled using Bayesian methods. We present modified expressions for the hyperlikelihood, its gradient, and its Hessian matrix, which have all been analytically maximized or marginalized over a single scale hyperparameter. This analytic maximization or marginalization reduces the dimensionality of the subsequent optimization problem and further speeds up the training and comparison of GPs. These techniques are useful when attempting to rapidly fit large, irregularly sampled datasets with a variety of covariance function models.
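As a hedged illustration of the dimensionality reduction described above: if the covariance matrix factorizes as K = s * C(theta) for an overall scale hyperparameter s, the Gaussian log hyperlikelihood can be maximized over s in closed form, with maximizer s_hat = y^T C^{-1} y / n, leaving a profiled expression that depends only on theta. The sketch below implements that standard result under those assumptions; the names are illustrative and not taken from the paper's code.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def profiled_log_hyperlikelihood(y, C):
    """Log hyperlikelihood with the scale s maximized out analytically.

    Assumes K = s * C. Substituting the closed-form maximizer
    s_hat = y^T C^{-1} y / n back into the Gaussian log likelihood
    gives the expression returned here, which depends only on the
    shape of C (i.e. the remaining hyperparameters theta).
    """
    n = len(y)
    L, lower = cho_factor(C, lower=True)
    quad = y @ cho_solve((L, lower), y)             # y^T C^{-1} y
    log_det_C = 2.0 * np.sum(np.log(np.diag(L)))    # log |C|
    s_hat = quad / n                                # analytic maximizer of s
    log_like = (-0.5 * n * np.log(s_hat)
                - 0.5 * log_det_C
                - 0.5 * n * (1.0 + np.log(2.0 * np.pi)))
    return log_like, s_hat
```

One would then optimize this profiled expression over the remaining kernel hyperparameters (e.g. with scipy.optimize.minimize), searching a space with one fewer dimension than the original problem.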