Abstract

Blackwell approachability, regret minimization and calibration are three criteria used to evaluate a strategy (or an algorithm) in sequential decision problems, described as repeated games between a player and Nature. Although at first sight they have little in common, links between them have been discovered: for instance, both consistent and calibrated strategies can be constructed by following, in some auxiliary game, an approachability strategy. We gather seminal and recent results, and develop and generalize Blackwell's elegant theory in several directions. The final objective is to show how approachability can be used as a basic, powerful tool to exhibit a new class of intuitive algorithms, based on simple geometric properties. In order to be complete, we also prove that approachability can be seen as a byproduct of the very existence of consistent or calibrated strategies.

Highlights

  • Sequential decision problems can be represented as repeated games between a player and Nature

  • The player has no external regret if, asymptotically, he could not have gained strictly more if he had known, before the beginning of the game, the empirical distribution of moves of Nature. This notion has notably been refined by Foster & Vohra [23] into internal regret: a player has no internal regret if he has no external regret on the set of stages where he played a specific action, as soon as this set is big enough

  • Φ-regret can be seen as a consequence of external or internal regret in the finite case; its introduction is more useful in the compact case that follows
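
The two notions of regret described in the highlights above can be made concrete on a finite history of play. The following is a minimal sketch, not taken from the paper: the function names, the payoff-matrix layout `payoff[action][nature_move]` and the normalization by the total number of stages are all assumptions made for illustration. Dividing the internal regret by the total number of stages is one way to encode the requirement that the set of stages where an action was played be "big enough".

```python
from typing import List, Sequence


def external_regret(payoff: Sequence[Sequence[float]],
                    player_moves: List[int],
                    nature_moves: List[int]) -> float:
    """Average gap between the best fixed action in hindsight and the realized payoff."""
    n = len(player_moves)
    realized = sum(payoff[i][j] for i, j in zip(player_moves, nature_moves))
    # Best constant action against the empirical sequence of Nature's moves.
    best_fixed = max(sum(row[j] for j in nature_moves) for row in payoff)
    return (best_fixed - realized) / n


def internal_regret(payoff: Sequence[Sequence[float]],
                    player_moves: List[int],
                    nature_moves: List[int]) -> float:
    """Largest average gain from swapping one action i for another action k
    on exactly the stages where i was played (hypothetical normalization by n)."""
    n = len(player_moves)
    n_actions = len(payoff)
    worst = 0.0
    for i in range(n_actions):
        # Nature's moves on the stages where the player chose action i.
        stages = [j for a, j in zip(player_moves, nature_moves) if a == i]
        for k in range(n_actions):
            gain = sum(payoff[k][j] - payoff[i][j] for j in stages)
            worst = max(worst, gain / n)
    return worst
```

A strategy has no external (respectively internal) regret when the corresponding quantity has a nonpositive limit superior as the number of stages grows, whatever Nature plays.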

Summary

Introduction

Sequential decision problems can be represented as repeated games between a player and Nature. When the player has several objectives, instead of considering some exogenous convex combination of these objectives or optimizing them in a given order (which would reduce this framework to the preceding one), Blackwell [9] introduced another concept. He considered that some target set is given and that the player’s goal is for the average outcome to converge to it, while Nature tries to push it away. A given closed set is approachable if the player has a strategy such that the average payoff remains, after some possibly large stage, arbitrarily close to this target set, no matter the sequence of moves of Nature. Using some generalized notions of regret and/or calibration, one can construct approachability strategies (in the case of convex sets).
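
To illustrate the definition of approachability, here is a minimal sketch of the best-known special case, which this summary does not spell out: the target set is the nonpositive orthant and the vector payoff is the regret vector, which yields the regret-matching procedure of Hart and Mas-Colell. The function name `approach_orthant` and the matrix layout `payoff[action][nature_move]` are assumptions. To keep the sketch deterministic, the player's payoff is taken in expectation over his mixed action; with actions actually drawn at random, the same kind of guarantee would hold only up to random fluctuations.

```python
from math import sqrt


def approach_orthant(payoff, nature_moves):
    """Return, stage by stage, the distance from the average regret vector
    to the nonpositive orthant under a Blackwell-type strategy."""
    n_actions = len(payoff)
    cum_regret = [0.0] * n_actions
    distances = []
    for stage, j in enumerate(nature_moves, start=1):
        # The direction from the orthant to the current average regret is
        # its positive part; the mixed action is proportional to it.
        positive = [max(r, 0.0) for r in cum_regret]
        total = sum(positive)
        if total > 0.0:
            mixed = [p / total for p in positive]
        else:
            # Already inside the target set: any action will do.
            mixed = [1.0 / n_actions] * n_actions
        expected = sum(x * payoff[i][j] for i, x in enumerate(mixed))
        # Vector payoff of the auxiliary game: regret against each fixed action.
        for k in range(n_actions):
            cum_regret[k] += payoff[k][j] - expected
        distances.append(sqrt(sum(max(r / stage, 0.0) ** 2 for r in cum_regret)))
    return distances
```

With payoffs in [0, 1] and A actions, the geometric argument behind Blackwell's condition gives a distance of at most sqrt(A / n) after n stages in this expected-payoff version, whatever sequence Nature plays, so the average regret approaches the orthant.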

Approachability of arbitrary sets
Approachable arbitrary set: Blackwell’s sufficient condition
Equivalent formulations and necessary condition
Specific case of convex sets
Sharper high probability bounds
Biased approachability
Deterministic approachability and procedures in law
Approachability in infinite dimension spaces
Approachability with activation
Variable stage duration
Unbounded payoffs and strong law of large numbers
Bounded memory
Approachability in continuous time
Information-based strategies
Potential-based and uniform-norm approachability
From weak approachability to approachability
Regret minimization
Finite action spaces
Internal and Φ-regret
Reductions: from external to Φ-regret
Compact action spaces
Generalizations
Experts
Regret and sets of equilibria
Calibration
Discussion on the impossibility of deterministic calibration
Efficient calibration in the binary case
Generalization
Smooth calibration
Using approachability to get regret
Using regret to get calibration
Using calibration to get regret and approachability
Using regret to get approachability
Game Theory lemma
Uniform concentration inequalities
Findings
Probability lemmas