Abstract

Skew-Gaussian Processes (SkewGPs) extend the multivariate Unified Skew-Normal distributions, defined over finite-dimensional vectors, to distributions over functions. SkewGPs are more general and flexible than Gaussian processes, as SkewGPs can also represent asymmetric distributions. In a recent contribution, we showed that SkewGPs and the probit likelihood are conjugate, which allows us to compute the exact posterior for non-parametric binary classification and preference learning. In this paper, we generalize previous results and prove that a SkewGP is conjugate with both the normal and the affine probit likelihood and, more generally, with their product. This allows us to (i) handle classification, preference, numeric and ordinal regression, and mixed problems in a unified framework; (ii) derive closed-form expressions for the corresponding posterior distributions. We show empirically that the proposed framework based on SkewGPs provides better performance than Gaussian processes in active learning and Bayesian (constrained) optimization. These two tasks are fundamental for the design of experiments and in Data Science.
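To make the "product" mentioned above concrete, a mixed likelihood of the kind handled in this framework can be written schematically as a product of Gaussian densities (numeric observations) and probit terms (binary, preferential and ordinal observations). The expression below is only indicative of the general affine-probit form; the precise likelihoods are defined in the corresponding sections of the paper:

\[
p(\mathcal{D}\mid f) \;=\; \prod_{i=1}^{m}\mathcal{N}\!\bigl(y_i \mid f(x_i),\,\sigma^2\bigr)\;\prod_{j=1}^{r}\Phi\!\bigl(w_j^{\top} f(\mathbf{X}) + z_j\bigr),
\]

where the first factor covers numeric (regression) outputs and the second, affine-probit factor covers classification, preference and ordinal outputs.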

Highlights

  • Gaussian Processes (GPs) are powerful nonparametric distributions over functions

  • In a recent paper (Benavoli et al., 2020), by extending a result derived by Durante (2019) for the parametric case, we showed that: (i) although the probit likelihood and the GP are not conjugate, the posterior process can still be computed in closed form and is a Skew Gaussian Process (SkewGP) (a schematic statement of the parametric result is given after these highlights); (ii) the SkewGP prior and the probit likelihood are conjugate

  • We compare the Laplace Approximation (LP), Expectation Propagation (EP) and SkewGP in two tasks, Bayesian active learning and Bayesian optimization, where a poor representation of uncertainty can lead to significant performance degradation
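For reference, the parametric result by Durante (2019) underlying point (i) can be stated schematically as follows; the notation is adapted and lightly simplified, so this should be read as a sketch of the theorem rather than a verbatim quotation. With a Gaussian prior $\beta \sim N_p(\xi, \Omega)$ and probit likelihood $y_i \mid \beta \sim \mathrm{Bern}\bigl(\Phi(x_i^{\top}\beta)\bigr)$, the posterior is a Unified Skew-Normal:

\[
\beta \mid \mathbf{y} \;\sim\; \mathrm{SUN}_{p,n}\!\Bigl(\xi,\;\Omega,\;\bar{\Omega}\,\omega\,D^{\top} s^{-1},\; s^{-1} D\,\xi,\; s^{-1}\bigl(D\,\Omega\,D^{\top} + I_n\bigr)\, s^{-1}\Bigr),
\]

where $D$ is the $n \times p$ matrix with rows $(2y_i - 1)\,x_i^{\top}$, $s = [\mathrm{diag}(D\,\Omega\,D^{\top} + I_n)]^{1/2}$, $\omega = \mathrm{diag}(\Omega)^{1/2}$ and $\bar{\Omega} = \omega^{-1}\,\Omega\,\omega^{-1}$. The nonparametric SkewGP results extend this conjugacy to distributions over functions.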


Summary

Introduction

Gaussian Processes (GPs) are powerful nonparametric distributions over functions. For real-valued outputs, we can combine the GP prior with a Gaussian likelihood and perform exact posterior inference in closed form. We can obtain posterior samples at any test point by exploiting an additive representation for Skew-Normal vectors, which decomposes a Skew-Normal vector into a linear combination of a normal vector and a truncated-normal vector. By exploiting the closed-form expressions for the posterior and predictive distributions, we compute inferences for regression, classification, preference and mixed problems with a computational complexity of O(n³) and storage demands of O(n²), i.e. identical to GP regression.
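As a concrete illustration of the additive representation mentioned above, the sketch below draws samples from a finite-dimensional Unified Skew-Normal vector SUN_{n,s}(ξ, Ω, Δ, γ, Γ) by combining a Gaussian vector with a truncated-Gaussian vector. This is only an illustrative sketch under the standard SUN parameterization, not the implementation used in the paper: the function name sample_sun is ours, and the naive rejection step for the truncated component would in practice be replaced by a dedicated truncated multivariate normal sampler (e.g. one based on minimax tilting).

```python
import numpy as np

def sample_sun(xi, Omega, Delta, gamma, Gamma, n_samples=1000, rng=None):
    """Illustrative sampler for z ~ SUN_{n,s}(xi, Omega, Delta, gamma, Gamma)
    via the additive representation
        z = xi + omega * (U0 + Delta Gamma^{-1} U1),
    where U0 ~ N_n(0, Omega_bar - Delta Gamma^{-1} Delta^T) and
    U1 ~ N_s(0, Gamma) truncated (componentwise) below -gamma."""
    rng = np.random.default_rng() if rng is None else rng
    n, s = Omega.shape[0], Gamma.shape[0]
    omega = np.sqrt(np.diag(Omega))              # marginal scales of Omega
    Omega_bar = Omega / np.outer(omega, omega)   # correlation matrix of Omega
    GinvDT = np.linalg.solve(Gamma, Delta.T)     # Gamma^{-1} Delta^T, shape (s, n)
    cov_U0 = Omega_bar - Delta @ GinvDT          # covariance of the Gaussian part

    # Truncated part: naive rejection sampling, acceptable only when the
    # region {u > -gamma} has non-negligible probability under N_s(0, Gamma).
    U1 = np.empty((0, s))
    while U1.shape[0] < n_samples:
        cand = rng.multivariate_normal(np.zeros(s), Gamma, size=n_samples)
        U1 = np.vstack([U1, cand[np.all(cand > -gamma, axis=1)]])
    U1 = U1[:n_samples]

    U0 = rng.multivariate_normal(np.zeros(n), cov_U0, size=n_samples)
    return xi + omega * (U0 + U1 @ GinvDT)       # samples of shape (n_samples, n)

# Example with hypothetical parameters: a 2-dimensional SUN with one latent truncation.
# z = sample_sun(np.zeros(2), np.eye(2), np.array([[0.7], [0.3]]),
#                np.array([0.0]), np.array([[1.0]]), n_samples=5000)
```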

Different types of observations and likelihood models
Background on the Skew‐Normal distribution and Skew Gaussian Processes
Unified Skew‐Normal distribution
Additive representations
Closure properties
Normal likelihood
Probit affine likelihood
Mixed likelihood
Sampling from the posterior and hyperparameters selection
Application to active learning and optimisation
Bayesian active learning
Bayesian optimisation
Preferential optimisation
Safe Bayesian optimisation
Conclusions