Abstract

The hyperparameters of a Gaussian process regression (GPR) model with a specified kernel are often estimated from the data by maximizing the marginal likelihood. Because the marginal likelihood is non-convex with respect to the hyperparameters, the optimization may not converge to the global maximum. A common approach to this issue is to use multiple starting points randomly selected from a specific prior distribution. As a result, the choice of prior distribution may play a vital role in the predictability of this approach. However, little research in the literature studies the impact of the prior distributions on the hyperparameter estimation and the performance of GPR. In this paper, we provide the first empirical study on this problem using simulated and real data experiments. We consider different types of priors for the initial values of hyperparameters for some commonly used kernels and investigate the influence of the priors on the predictability of GPR models. The results reveal that, once a kernel is chosen, different priors for the initial hyperparameters have no significant impact on the performance of GPR prediction, even though the estimates of the hyperparameters are very different from the true values in some cases.

Highlights

  • Over the last few decades, Gaussian process regression (GPR) has proven to be a powerful and effective method for non-linear regression problems due to many desirable properties, such as ease of obtaining and expressing uncertainty in predictions, the ability to capture a wide variety of behaviour through a simple parameterization, and a natural Bayesian interpretation [1].

  • Various empirical studies have shown that GPR can achieve better predictive performance in many areas [5, 6, 7, 8] than some other models such as the Support Vector Machine (SVM) [9, 10, 11], and a number of further developments of Gaussian process methods have been proposed, including deep Gaussian processes [12] and recurrent Gaussian processes [13].

  • The choice of kernel has a profound impact on the performance of a GPR model, just as the activation function and learning rate can affect the result of a neural network [14].

Summary

Introduction

Over the last few decades, Gaussian process regression (GPR) has proven to be a powerful and effective method for non-linear regression problems due to many desirable properties, such as ease of obtaining and expressing uncertainty in predictions, the ability to capture a wide variety of behaviour through a simple parameterization, and a natural Bayesian interpretation [1]. Most practitioners using GPR as a modelling tool tend to choose a simple prior distribution based on their expert opinions and experiences, such as the Uniform distribution in the range of (0, 1) [4, 17, 20]. It is therefore important to investigate whether the predictability of GPR models would be jeopardised if the prior distribution were not properly chosen, and how the choice of prior distribution may affect the performance of GPR models [19, 20]. We consider different types of priors, including vague and data-dominated, for the initial values of hyperparameters for some commonly used kernels and investigate the influence of the priors on the predictability of GPR models.
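
To make the multi-start procedure concrete, the following is a minimal sketch, not the authors' implementation. It assumes a squared exponential kernel with three hyperparameters (signal variance, length scale, noise variance), optimizes in log space with SciPy's L-BFGS-B, and keeps the restart with the lowest negative log marginal likelihood. The function names, the toy sine data, the optimizer bounds, and the small positive lower limit on the uniform prior (used only so the logarithm is defined) are all illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize


def se_kernel(x1, x2, signal_var, length_scale):
    """Squared exponential kernel for 1-D inputs."""
    sq_dist = (x1[:, None] - x2[None, :]) ** 2
    return signal_var * np.exp(-0.5 * sq_dist / length_scale**2)


def neg_log_marginal_likelihood(log_theta, x, y):
    """Negative log marginal likelihood of a zero-mean GP with Gaussian noise."""
    signal_var, length_scale, noise_var = np.exp(log_theta)
    n = len(x)
    # Small jitter (an assumption) keeps the Cholesky factorization stable.
    K = se_kernel(x, x, signal_var, length_scale) + (noise_var + 1e-8) * np.eye(n)
    try:
        L = np.linalg.cholesky(K)
    except np.linalg.LinAlgError:
        return 1e10  # penalize hyperparameters that give a non-PD matrix
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return 0.5 * y @ alpha + np.sum(np.log(np.diag(L))) + 0.5 * n * np.log(2 * np.pi)


def multi_start_fit(x, y, sample_prior, n_starts=10, seed=0):
    """Run one local optimization per starting point drawn from the prior."""
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(n_starts):
        theta0 = sample_prior(rng)  # initial hyperparameters, all positive
        res = minimize(
            neg_log_marginal_likelihood,
            np.log(theta0),
            args=(x, y),
            method="L-BFGS-B",
            bounds=[(-10.0, 10.0)] * 3,  # illustrative bounds in log space
        )
        if best is None or res.fun < best.fun:
            best = res
    return np.exp(best.x), best.fun


def uniform_prior(rng):
    """Uniform prior on roughly (0, 1) for all three initial hyperparameters."""
    return rng.uniform(1e-3, 1.0, size=3)


# Toy data (hypothetical): a noisy sine curve.
data_rng = np.random.default_rng(42)
x_train = np.linspace(0.0, 5.0, 30)
y_train = np.sin(x_train) + 0.1 * data_rng.standard_normal(30)

theta_hat, nll_hat = multi_start_fit(x_train, y_train, uniform_prior, n_starts=5)
print("estimated hyperparameters:", theta_hat)
print("negative log marginal likelihood:", nll_hat)
```

Swapping `uniform_prior` for another sampler (for example, one scaled by the range of the data) changes only where the local searches begin, which is the kind of comparison the study carries out.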

Gaussian processes regression model
Kernels
Squared exponential
Estimation of hyperparameters
Sensitivity of prior distributions for initial hyperparameters
Vague priors
Data-dominated priors
Experiments using samples from Gaussian processes
Squared Exponential kernel
Periodic Kernel
Experiments using samples from time series
Findings
Conclusion
