Response Surface Bandits

Josep Ginebra, Murray K Clayton

doi:10.1111/j.2517-6161.1995.tb02062.x

Abstract

In this paper we define a response surface bandit as a sequential design problem in which the goal is to maximize an expected bandit utility, but where the outcomes $y_n$ are continuous and can be related through a response surface to a set of controllable variables $x_n = (x_{1n}, x_{2n}, \ldots, x_{kn})$. We link this problem to traditional optimization problems from industrial engineering and to the traditional bandit problem. We consider two approaches. The first is based on a myopic sequential design. The second uses the best design out of a family of designs related to upper bounds for the predicted surface; the family includes myopic designs and sequential versions of D-optimal designs. These approaches can be generalized to more broadly defined sequential problems.
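
The idea of a family of designs indexed by an upper bound on the predicted surface can be illustrated with a minimal sketch. It is not the authors' algorithm: the quadratic surface in one variable, the Gaussian prior, the noise level, the candidate grid, and every function name below are assumptions made for illustration. The rule picks the x that maximizes predicted mean plus c times predictive standard deviation. Setting c = 0 gives a myopic choice, while a very large c favours the point of largest predictive variance, which is the usual sequential D-optimal step for a linear model.

```python
import math
import random

def features(x):
    """Quadratic response surface in one controllable variable (an assumption)."""
    return [1.0, x, x * x]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def mat_vec(S, v):
    return [dot(row, v) for row in S]

def pred_var(S, x):
    """Posterior variance of the surface at x (observation noise excluded)."""
    f = features(x)
    return dot(f, mat_vec(S, f))

def update(m, S, f, y, noise_var):
    """Rank-one Bayesian update of posterior mean m and covariance S."""
    Sf = mat_vec(S, f)
    denom = noise_var + dot(f, Sf)
    resid = y - dot(f, m)
    m_new = [mi + sfi * resid / denom for mi, sfi in zip(m, Sf)]
    k = len(f)
    S_new = [[S[i][j] - Sf[i] * Sf[j] / denom for j in range(k)] for i in range(k)]
    return m_new, S_new

def choose_x(m, S, grid, c):
    """Pick the grid point maximizing mean + c * sd of the predicted surface."""
    def bound(x):
        return dot(features(x), m) + c * math.sqrt(max(pred_var(S, x), 0.0))
    return max(grid, key=bound)

def make_grid():
    """Hypothetical design region: 41 equally spaced points on [-1, 1]."""
    return [i / 20 - 1.0 for i in range(41)]

def run(c, n_steps=60, seed=0, noise_sd=0.1):
    """Simulate the sequential design on a hypothetical true surface."""
    rng = random.Random(seed)
    true_surface = lambda x: 1.0 - (x - 0.3) ** 2  # hypothetical, peak at 0.3
    grid = make_grid()
    m = [0.0, 0.0, 0.0]                              # prior mean (assumed)
    S = [[10.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    xs = []
    for _ in range(n_steps):
        x = choose_x(m, S, grid, c)
        y = true_surface(x) + rng.gauss(0.0, noise_sd)
        m, S = update(m, S, features(x), y, noise_sd ** 2)
        xs.append(x)
    return xs, m
```

Calling `run(c=0.0)` and `run(c=2.0)` and comparing the returned design points gives a feel for how the constant c trades off immediate reward against learning about the surface.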

Full Text