Abstract
The hyperparameters in a Gaussian process regression (GPR) model with a specified kernel are often estimated from the data via maximum marginal likelihood. Because the marginal likelihood is non-convex in the hyperparameters, the optimization may not converge to the global maximum. A common approach to tackle this issue is to use multiple starting points randomly sampled from a specific prior distribution. As a result, the choice of prior distribution may play a vital role in the predictability of this approach. However, there exists little research in the literature on the impact of the prior distributions on hyperparameter estimation and the performance of GPR. In this paper, we provide the first empirical study of this problem using simulated and real data experiments. We consider different types of priors for the initial values of the hyperparameters for some commonly used kernels and investigate the influence of the priors on the predictability of GPR models. The results reveal that, once a kernel is chosen, different priors for the initial hyperparameters have no significant impact on the performance of GPR prediction, even though the estimates of the hyperparameters are very different from the true values in some cases.
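The multi-start strategy described above can be sketched in a few lines. This is a minimal illustration only, assuming scikit-learn (not referenced in the paper): initial hyperparameter values are drawn from a Uniform prior, the marginal likelihood is maximized from each starting point, and the fit with the highest log marginal likelihood is kept.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

rng = np.random.default_rng(0)
X = rng.uniform(0, 5, size=(40, 1))
y = np.sin(X).ravel() + 0.1 * rng.standard_normal(40)

best_gpr, best_lml = None, -np.inf
for _ in range(10):
    # Draw initial signal-variance and length-scale values from a
    # Uniform prior (bounded away from zero so they stay valid).
    init_scale, init_length = rng.uniform(0.05, 1.0, size=2)
    kernel = ConstantKernel(init_scale) * RBF(init_length)
    # Each fit maximizes the log marginal likelihood from this start.
    gpr = GaussianProcessRegressor(kernel=kernel, alpha=1e-2).fit(X, y)
    if gpr.log_marginal_likelihood_value_ > best_lml:
        best_gpr, best_lml = gpr, gpr.log_marginal_likelihood_value_
```

Because the marginal likelihood is non-convex, different starting points can converge to different local optima; keeping the best of several restarts is the standard remedy the paper studies.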
Highlights
Over the last few decades, Gaussian Process Regression (GPR) has proven to be a powerful and effective method for non-linear regression problems due to many desirable properties, such as the ease of obtaining and expressing uncertainty in predictions, the ability to capture a wide variety of behaviour through a simple parameterization, and a natural Bayesian interpretation [1].
Various empirical studies have shown that GPR can achieve better predictive performance in many areas [5, 6, 7, 8] compared to other models such as the Support Vector Machine (SVM) [9, 10, 11], and a number of further developments of Gaussian process methods have been proposed, including deep Gaussian processes [12] and recurrent Gaussian processes [13].
The choice of kernel has a profound impact on the performance of a GPR model, just as the activation function and learning rate can affect the result of a neural network [14].
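The effect of kernel choice can be seen by fitting the same data under different kernels and comparing the resulting log marginal likelihoods. This is an illustrative sketch, again assuming scikit-learn; the specific kernels and data are not taken from the paper.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, Matern, RationalQuadratic

rng = np.random.default_rng(1)
X = rng.uniform(0, 5, size=(50, 1))
y = np.sin(3 * X).ravel() + 0.1 * rng.standard_normal(50)

# Fit the same data under three commonly used kernels and record
# the maximized log marginal likelihood for each.
results = {}
for name, kernel in [("RBF", RBF(1.0)),
                     ("Matern-1.5", Matern(1.0, nu=1.5)),
                     ("RationalQuadratic", RationalQuadratic(1.0))]:
    gpr = GaussianProcessRegressor(kernel=kernel, alpha=1e-2).fit(X, y)
    results[name] = gpr.log_marginal_likelihood_value_
```

Kernels encode different smoothness assumptions (e.g. the Matérn family is less smooth than the RBF), so the same data can be explained quite differently depending on this choice.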
Summary
Most practitioners using GPR as a modelling tool tend to choose a simple prior distribution based on their expert opinions and experience, such as the Uniform distribution on the range (0, 1) [4, 17, 20]. It is of importance and of interest to investigate whether the predictability of GPR models would be jeopardised if the prior distribution were not properly chosen, and how the choice of prior distribution may affect the performance of GPR models [19, 20]. We consider different types of priors, including vague and data-dominated, for the initial values of the hyperparameters for some commonly used kernels and investigate the influence of the priors on the predictability of GPR models.