Structural properties on the boundary and monotonicity of optimal strategies in a two-armed bandit process with delayed responses are explored. Moreover, a finite-horizon optimal stopping solution is derived, which complements a known result in the infinite-horizon case.