Abstract

Given a finite number of stochastic systems, our goal is to dynamically allocate a finite sampling budget so as to maximize the probability of selecting the “best” system. Each system is characterized by the probability distribution that governs its sample observations; these distributions are unknown and are assumed only to belong to a broad family that need not admit any parametric representation. The best system is defined as the one with the highest quantile value. The objective of maximizing the probability of selecting this best system is not analytically tractable. Instead, we use the rate function of the probability of error, relying on large deviations theory. Our point of departure is an algorithm that naively combines sequential estimation and myopic optimization. This algorithm is shown to be asymptotically optimal; however, it exhibits poor finite-time performance and does not lend itself to implementation in settings with a large number of systems. To address this, we propose practically implementable variants that retain the asymptotic performance of the former while dramatically improving its finite-time performance.
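
To make the problem setting concrete, the following is a minimal sketch of quantile-based selection under a static, equal-allocation baseline. It is not the algorithm studied in this work, which allocates the budget dynamically; it only illustrates how a "best" system is picked from nonparametric quantile estimates and how the probability of correct selection can be estimated by simulation. The function names, the example systems, the quantile level, and the budget are all assumptions chosen for illustration.

```python
import math
import random


def empirical_quantile(samples, p):
    """Order-statistic estimate of the p-quantile (no parametric assumption)."""
    ordered = sorted(samples)
    # Index of the smallest order statistic whose empirical CDF is >= p.
    k = max(0, min(len(ordered) - 1, math.ceil(p * len(ordered)) - 1))
    return ordered[k]


def select_best(samplers, budget, p, rng):
    """Split the budget equally and pick the system with the highest estimate."""
    per_system = budget // len(samplers)
    estimates = [
        empirical_quantile([draw(rng) for _ in range(per_system)], p)
        for draw in samplers
    ]
    return max(range(len(samplers)), key=estimates.__getitem__)


def estimate_pcs(samplers, best_index, budget, p, replications, seed=0):
    """Monte Carlo estimate of the probability of selecting the best system."""
    rng = random.Random(seed)
    hits = sum(
        select_best(samplers, budget, p, rng) == best_index
        for _ in range(replications)
    )
    return hits / replications


# Hypothetical systems: the true medians are 0, 1, and ln(2), so index 1 is best.
SYSTEMS = [
    lambda rng: rng.gauss(0.0, 1.0),
    lambda rng: rng.gauss(1.0, 1.0),
    lambda rng: rng.expovariate(1.0),
]

if __name__ == "__main__":
    pcs = estimate_pcs(SYSTEMS, best_index=1, budget=300, p=0.5, replications=500)
    print(f"Estimated probability of correct selection: {pcs:.3f}")
```

A dynamic policy would replace the equal split in `select_best` with a rule that decides, after each observation, which system to sample next; the equal split serves only as a reference point against which such policies can be compared.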
