Abstract

We revisit the problem of online linear optimization in the case where the set of feasible actions is accessible through an approximate linear optimization oracle with a multiplicative approximation factor α. This setting is of particular interest because it captures natural online extensions of well-studied offline linear optimization problems that are NP-hard yet admit efficient approximation algorithms. The goal is to minimize the α-regret, the natural extension of the standard regret in online learning to this setting. We present new algorithms with significantly improved oracle complexity for both the full-information and bandit variants of the problem. In particular, for both variants, we present α-regret bounds of [Formula: see text], where T is the number of prediction rounds, using only [Formula: see text] calls to the approximation oracle per iteration, on average. These are the first results to obtain both an average oracle complexity of [Formula: see text] (or even polylogarithmic in T) and an α-regret bound of [Formula: see text] for a constant c > 0, for both variants.
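
For readers unfamiliar with the term, one common way to formalize the α-regret is sketched below. The notation is an assumption and is not taken from the abstract: it supposes a loss-minimization problem with α ≥ 1, a feasible set \mathcal{K}, loss vectors f_t revealed at round t, and played actions x_t in \mathcal{K}. Some treatments instead divide by T and bound the average.

```latex
\[
  \alpha\text{-regret}_T
  \;=\;
  \sum_{t=1}^{T} \langle x_t, f_t \rangle
  \;-\;
  \alpha \cdot \min_{x \in \mathcal{K}} \sum_{t=1}^{T} \langle x, f_t \rangle
\]
```

Under this convention, setting α = 1 recovers the standard regret, and the comparator is scaled by α because the oracle can only guarantee an α-approximate solution even offline.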
