Abstract

Bayesian Optimization has become the reference method for the global optimization of black-box, expensive, and possibly noisy functions. Bayesian Optimization learns a probabilistic model of the objective function, usually a Gaussian Process, and builds from its mean and variance an acquisition function whose optimizer yields the new evaluation point, after which the probabilistic surrogate model is updated. Despite its sample efficiency, Bayesian Optimization does not scale well with the dimensionality of the problem. Moreover, the optimization of the acquisition function has received less attention because its computational cost is usually considered negligible compared to that of evaluating the objective function; its efficient optimization is also hindered, particularly in high-dimensional problems, by multiple extrema and "flat" regions. In this paper we leverage the additivity (i.e., separability) of the objective function to map both the kernel and the acquisition function of the Bayesian Optimization into lower-dimensional subspaces. This approach makes both the learning/updating of the probabilistic surrogate model and the optimization of the acquisition function more efficient. Experimental results are presented for a standard test function and a real-life application.
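To make the idea concrete, the following is a minimal sketch, not the authors' implementation, of Bayesian Optimization with an additive Gaussian Process: the kernel is assumed to be a sum of low-dimensional RBF kernels over disjoint coordinate groups, and a UCB-style acquisition (in the spirit of additive GP-UCB) is maximized independently in each subspace. The objective, the coordinate `groups`, the lengthscale, and the random-search inner optimizer are all illustrative assumptions.

```python
import numpy as np

def rbf(A, B, lengthscale=0.3):
    """Squared-exponential kernel between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def additive_kernel(X1, X2, groups):
    """Additive kernel: k(x, x') = sum_g k_g(x_g, x'_g) over disjoint groups."""
    return sum(rbf(X1[:, g], X2[:, g]) for g in groups)

def objective(x):
    # Toy separable objective (assumed): f(x) = f1(x0, x1) + f2(x2, x3).
    return -np.sum((x[:2] - 0.3) ** 2) - np.sum((x[2:] - 0.7) ** 2)

def ucb_group(cand_g, X_g, K_inv, K_inv_y, beta=2.0):
    """Group-wise UCB: mu_g + beta * sigma_g, using only the g-th sub-kernel."""
    k_star = rbf(cand_g, X_g)                          # cross-covariance of group g
    mu_g = k_star @ K_inv_y                            # additive part of the posterior mean
    var_g = 1.0 - np.einsum('ij,jk,ik->i', k_star, K_inv, k_star)
    return mu_g + beta * np.sqrt(np.maximum(var_g, 1e-12))

rng = np.random.default_rng(0)
dim, groups, noise = 4, [[0, 1], [2, 3]], 1e-6
X = rng.uniform(0, 1, (5, dim))                        # initial design
y = np.array([objective(x) for x in X])

for it in range(20):
    # Fit the additive GP surrogate on the data observed so far.
    K = additive_kernel(X, X, groups) + noise * np.eye(len(X))
    K_inv = np.linalg.inv(K)
    K_inv_y = K_inv @ y
    x_next = np.empty(dim)
    # Optimize the acquisition independently in each low-dimensional subspace.
    for idx in groups:
        cand = rng.uniform(0, 1, (512, len(idx)))      # cheap random search per group
        scores = ucb_group(cand, X[:, idx], K_inv, K_inv_y)
        x_next[idx] = cand[np.argmax(scores)]
    X = np.vstack([X, x_next])
    y = np.append(y, objective(x_next))

print("best value found:", y.max())
```

Because the kernel decomposes over coordinate groups, each inner acquisition search runs over a 2-dimensional candidate set instead of the full 4-dimensional space, which is the source of the efficiency gain described in the abstract.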
