Abstract

We consider the optimization of batch data processing when two alternative processing methods with different, unknown efficiencies are available. The goal is to identify the more efficient method and ensure that it is predominantly used. Formally, the problem is stated as a Gaussian two-armed bandit problem in which the mathematical expectations and variances of the incomes are unknown a priori. We consider the problem in a robust (minimax) setting. According to the main theorem of game theory, the minimax strategy and minimax risk are sought as the Bayesian ones corresponding to the worst-case prior distribution of the parameter. We describe the properties of the worst-case prior distribution and present the corresponding recursive equations for determining the Bayesian risk and expected losses. Some numerical examples are presented. We show that the control performance is almost independent of the number of processed batches, provided this number is large enough.
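
In symbols, the robust setting described above can be sketched as follows. The notation is an assumption introduced here for illustration and does not appear in the abstract: $\sigma$ is a control strategy, $\theta = (m_1, D_1, m_2, D_2)$ collects the unknown expectations and variances of the two methods, $\Theta$ is the set of admissible parameters, $\lambda$ is a prior distribution on $\Theta$ with $\lambda_0$ the worst-case one, $\xi_n$ is the income at step $n$, and $N$ is the control horizon.

```latex
\begin{align}
  % Expected losses of strategy \sigma at parameter \theta (assumed notation)
  L_N(\sigma,\theta) &= N \max(m_1, m_2)
      - \mathbb{E}_{\sigma,\theta}\Bigl[\sum_{n=1}^{N} \xi_n\Bigr], \\
  % Minimax risk over the parameter set \Theta
  R_N^{M}(\Theta) &= \inf_{\sigma} \sup_{\theta \in \Theta} L_N(\sigma,\theta), \\
  % Bayesian risk with respect to a prior \lambda
  R_N^{B}(\lambda) &= \inf_{\sigma} \int_{\Theta} L_N(\sigma,\theta)\,\lambda(d\theta), \\
  % Main theorem of game theory: minimax risk equals the Bayesian risk
  % at the worst-case prior \lambda_0
  R_N^{M}(\Theta) &= \sup_{\lambda} R_N^{B}(\lambda) = R_N^{B}(\lambda_0).
\end{align}
```

Under this reading, the minimax strategy is the Bayesian strategy computed for $\lambda_0$, which is why the properties of the worst-case prior are the central object of study.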
