Reflections on a New Approach to Gittins Indexation

K D Glazebrook, R Garbe

doi:10.1057/palgrave.jors.0471010

Abstract

Traditional approaches to stochastic resource allocation problems (including the classical multi-armed bandit problems) have usually made use of dynamic programming (DP) methodology, perhaps buttressed by further ad hoc arguments. While such approaches seem ‘natural’, they have usually proved technically very difficult. Bertsimas and Niño-Mora have recently given a radically new account of many important results in this area which relate to Gittins indices. The key to their approach is the characterisation of the region of achievable performance. The optimisation problems of interest are then solved as linear programs over this region. Here we exploit elements within the Bertsimas and Niño-Mora framework (in particular, its capacity to give formulae for the total return of a given policy in closed form) to obtain (i) a simple dynamic programming proof of the optimality of Gittins index policies and (ii) a range of index-based suboptimality bounds for general policies for a variety of stochastic models for resource allocation.
