Abstract

A sequential design problem, also called the ‘two-armed bandit problem’, is considered under the condition that a continuous random variable is observed from a general one-parameter distribution with probability p and no observation is obtained with probability 1 − p. The problem is formulated using the principle of optimality of dynamic programming, and some properties of the optimal strategy are obtained under several conditions. In the case where one arm is known, the optimal strategy is derived explicitly using the critical value function.
