Abstract

This paper proposes a method for solving optimization problems in which the decision-maker cannot evaluate the objective function, but can only express a preference such as “this is better than that” between two candidate decision vectors. The algorithm described in this paper aims at reaching the global optimizer by iteratively proposing to the decision maker a new comparison to make, based on actively learning a surrogate of the latent (unknown and perhaps unquantifiable) objective function from past sampled decision vectors and pairwise preferences. A radial-basis-function surrogate is fit via linear or quadratic programming, satisfying, if possible, the preferences expressed by the decision maker on existing samples. The surrogate is used to propose a new sample of the decision vector for comparison with the current best candidate, based on two possible criteria: minimize a combination of the surrogate and an inverse distance weighting function, to balance exploitation of the surrogate against exploration of the decision space; or maximize a function related to the probability that the new candidate will be preferred. Compared to active preference learning based on Bayesian optimization, we show that our approach is competitive in that, within the same number of comparisons, it usually approaches the global optimum more closely and is computationally lighter. Applications of the proposed algorithm to a set of benchmark global optimization problems, to multi-objective optimization, and to the optimal tuning of a cost-sensitive neural network classifier for object recognition from images are described in the paper. MATLAB and Python implementations of the algorithms described in the paper are available at http://cse.lab.imtlucca.it/~bemporad/glis.
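The preference-based setting described above can be illustrated with a minimal sketch: the algorithm never reads the latent objective directly, it only observes the outcome of pairwise comparisons. The latent function `f` and tolerance `tol` below are hypothetical choices for illustration, not the paper's benchmarks.

```python
import numpy as np

# Hypothetical latent objective: in a real application this is the
# decision maker's unquantifiable notion of quality; the optimizer
# never evaluates it directly.
def f(x):
    return np.sum((x - 0.3) ** 2)

def preference(x1, x2, tol=1e-4):
    """Synthetic decision maker: return -1 if x1 is preferred to x2,
    1 if x2 is preferred to x1, and 0 if they are deemed equivalent.
    Only this -1/0/1 outcome is exposed to the optimizer."""
    d = f(x1) - f(x2)
    if d < -tol:
        return -1
    elif d > tol:
        return 1
    return 0

a = np.array([0.1, 0.2])  # closer to the (hidden) optimum at 0.3
b = np.array([0.5, 0.9])
print(preference(a, b))  # -1: a is preferred
```

A preference-learning algorithm queries this oracle on pairs of candidates and must infer a surrogate of `f` consistent with the observed -1/0/1 outcomes.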

Highlights

  • Taking an optimal decision is the process of selecting the values of certain variables that produce “best” results

  • In this paper we propose a new approach for optimization based on active preference learning in which the surrogate function is modeled by radial basis functions (RBFs)

  • Our formulation does not require deriving posterior probability distributions, with the advantages that (1) it is more general than preferential Bayesian optimization (PBO): for example, additional constraints on the surrogate function, which might not have a probabilistic interpretation, can be immediately taken into account in the convex programming problem; (2) the RBF + inverse distance weighting (IDW) version of the method, in particular, is purely deterministic and delivers a similar level of performance with an easier interpretation than PBO; (3) the method does not require approximating posteriors that cannot be computed analytically and may be computationally demanding


Summary

Learning and optimization from preferences

Taking an optimal decision is the process of selecting the values of certain variables that produce “best” results. Finding a global optimizer of f from preference information can be reinterpreted as the problem of looking for the vector x⋆ that is preferred to any other vector x. Such a preference-based optimization approach requires a solution method that only observes the outcome of the comparison f(x1) ≤ f(x2), not the values f(x1), f(x2), nor even the value of the difference f(x1) − f(x2). Rather than minimizing the surrogate, which may lead to missing the global optimum of the actual objective function, an acquisition function is minimized instead to generate new candidates. The latter consists of a combination of the surrogate and an extra term that promotes exploring areas of the decision space that have not yet been sampled. In Bemporad (2020), general radial basis functions (RBFs) are used to construct the surrogate, and inverse distance weighting (IDW) functions to promote exploration of the space of decision variables
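The two ingredients of the acquisition function described above can be sketched as follows. This is an illustrative reading of the RBF + IDW idea, not the paper's exact formulation: the inverse-quadratic RBF shape, the weights `beta` (which the method would obtain from an LP/QP consistent with the preferences), and the trade-off parameter `delta` are all assumptions made here for concreteness.

```python
import numpy as np

def rbf_surrogate(x, X, beta, eps=1.0):
    """RBF surrogate f_hat(x) = sum_i beta_i * phi(||x - x_i||),
    with inverse-quadratic basis phi(r) = 1 / (1 + (eps*r)^2)."""
    r = np.linalg.norm(X - x, axis=1)
    return np.sum(beta / (1.0 + (eps * r) ** 2))

def idw_exploration(x, X):
    """IDW exploration term: zero at already-sampled points,
    growing in regions far from all samples."""
    d2 = np.sum((X - x) ** 2, axis=1)
    if np.any(d2 == 0.0):
        return 0.0  # no exploration reward where we already sampled
    w = 1.0 / d2  # inverse squared distances to the samples
    return (2.0 / np.pi) * np.arctan(1.0 / np.sum(w))

def acquisition(x, X, beta, delta=1.0):
    """Smaller is better: exploit the surrogate, minus an
    exploration bonus weighted by delta."""
    return rbf_surrogate(x, X, beta) - delta * idw_exploration(x, X)
```

Minimizing `acquisition` over the feasible set (e.g. with any global solver over box constraints) yields the next candidate to compare against the current best; `delta` trades off trusting the surrogate against sampling unexplored regions.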

Preference‐based optimization of expensive black‐box function
Contribution
Outline
Problem statement
Surrogate function
Self‐calibration of RBF
Acquisition function
Scaling
Preference learning algorithm
Computational complexity
Application to multi‐objective optimization
Numerical results
Illustrative example
Benchmark global optimization problems
Choosing optimal cost‐sensitive classifiers via preferences
Findings
Conclusions