Abstract
Blackwell approachability, regret minimization and calibration are three criteria used to evaluate a strategy (or an algorithm) in sequential decision problems, described as repeated games between a player and Nature. Although at first sight they have little in common, links between them have been discovered: for instance, both consistent and calibrated strategies can be constructed by following, in some auxiliary game, an approachability strategy. We gather seminal and recent results, and develop and generalize Blackwell's elegant theory in several directions. The final objective is to show how approachability can be used as a basic, powerful tool to exhibit a new class of intuitive algorithms based on simple geometric properties. For completeness, we also prove that approachability can be seen as a byproduct of the very existence of consistent or calibrated strategies.
Highlights
Sequential decision problems can be represented as repeated games between a player and Nature
The player has no external regret if, asymptotically, he could not have gained strictly more if he had known, before the beginning of the game, the empirical distribution of moves of Nature. This notion has notably been refined by Foster & Vohra [23] into internal regret: a player has no internal regret if he has no external regret on the set of stages where he played a specific action, as soon as this set is big enough
Although Φ-regret can be seen as a consequence of external or internal regret in the finite case, its introduction is more useful in the compact case
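The notion of external regret stated in the highlights above can be made concrete with a short computation. The following is a minimal sketch, assuming a finite game described by a hypothetical payoff matrix `payoff[i][j]` (player's action `i`, Nature's action `j`). The function name and arguments are illustrative and do not come from the paper. It compares the average payoff actually obtained with the payoff of the best fixed action against the empirical distribution of Nature's moves.

```python
def external_regret(payoff, player_moves, nature_moves):
    """Average external regret of a realized sequence of play.

    payoff[i][j]: player's payoff for action i against Nature's action j
    (a hypothetical finite game used only for illustration).
    """
    n_stages = len(player_moves)
    n_nature = len(payoff[0])

    # Empirical distribution of Nature's moves over the whole play.
    freq = [0.0] * n_nature
    for j in nature_moves:
        freq[j] += 1.0 / n_stages

    # Payoff of the best fixed action against that empirical distribution.
    best_fixed = max(
        sum(row[j] * freq[j] for j in range(n_nature)) for row in payoff
    )

    # Average payoff the player actually obtained.
    realized = sum(
        payoff[i][j] for i, j in zip(player_moves, nature_moves)
    ) / n_stages

    return best_fixed - realized
```

For instance, with `payoff = [[1, 0], [0, 1]]`, player moves `[0, 0, 0, 0]` and Nature's moves `[0, 1, 1, 1]`, the best fixed action earns 0.75 on average while the player earned 0.25, so the function returns 0.5. A strategy has no external regret when this quantity is asymptotically non-positive, whatever Nature plays.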
Summary
Sequential decision problems can be represented as repeated games between a player and Nature. Instead of considering some exogenous convex combination of these objectives or optimizing them in a given order (which would reduce this framework to the preceding one), Blackwell [9] introduced another concept. He considered that some target set is given and that the player's goal is to make the average outcome converge to it; Nature, on the contrary, tries to push it away. A given closed set is approachable if the player has a strategy such that the average payoff remains, after some possibly large stage, arbitrarily close to this target set, no matter the sequence of moves of Nature. Using some generalized notions of regret and/or calibration, one can construct approachability strategies (in the case of convex sets)
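To illustrate how an approachability strategy operates for a convex target set, here is a minimal sketch under stated assumptions; it is not the paper's construction. The auxiliary game has vector payoffs equal to regrets (component `k` is the gain from having played action `k` instead of the chosen mixed action), and the target set is the non-positive orthant. The player steers the average payoff back toward the target by playing a mixed action proportional to the positive part of the cumulative regret, a rule commonly known as regret matching. Payoffs are evaluated in expectation over the mixed action, which is an assumption made here to keep the example deterministic. The name `approach_nonpositive_orthant` is hypothetical.

```python
def approach_nonpositive_orthant(payoff, nature_moves):
    """Return the average regret vector after facing nature_moves.

    payoff[i][j]: player's payoff for action i against Nature's action j
    (hypothetical finite game with payoffs in [0, 1]).
    """
    n_actions = len(payoff)
    cumulative = [0.0] * n_actions  # running sum of vector payoffs (regrets)

    for j in nature_moves:
        # The direction from the target set to the current average is the
        # positive part of the cumulative regret vector.
        positive = [max(r, 0.0) for r in cumulative]
        total = sum(positive)
        if total > 0.0:
            # Mixed action proportional to that positive part.
            mixed = [p / total for p in positive]
        else:
            # The average is already in the target set: any action works.
            mixed = [1.0 / n_actions] * n_actions

        # Expected payoff of the mixed action against Nature's move j.
        expected = sum(mixed[i] * payoff[i][j] for i in range(n_actions))

        # Vector payoff of the auxiliary game: regret toward each action k.
        for k in range(n_actions):
            cumulative[k] += payoff[k][j] - expected

    n_stages = len(nature_moves)
    return [r / n_stages for r in cumulative]
```

With this choice of mixed action, the expected vector payoff is orthogonal to the direction pointing away from the orthant, whatever Nature plays. As a result, the distance from the average regret vector to the orthant is at most the square root of (number of actions / number of stages) when payoffs lie in [0, 1]. The sketch thus also shows, in a small case, how a consistent strategy arises from an approachability strategy in an auxiliary game.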