Abstract
The connection between optimal stopping times of American Options and multi-armed bandits is the subject of active research. This article investigates the effects of optional stopping in a particular class of multi-armed bandit experiments, which randomly allocates observations to arms proportional to the Bayesian posterior probability that each arm is optimal (Thompson sampling). The interplay between optional stopping and prior mismatch is examined. We propose a novel partitioning of regret into peri/post testing. We further show a strong dependence of the parameters of interest on the assumed prior probability density.
Highlights
Sequential testing procedures allow an experiment to stop early, once the available/streaming data collected is sufficient to make a conclusion
This paper expands upon prior work (Loecher, 2017) and focuses on the effects of optional stopping on the multi-armed bandit (MAB) procedure outlined by (Scott, 2010; Scott, 2012)
Our work touches upon previous research on American Options (Bank and Föllmer, 2003) that demonstrates a close connection between multi-armed bandits and optimal stopping times in terms of the Snell envelope of the given payoff process
Summary
Sequential testing procedures allow an experiment to stop early, once the available/streaming data collected is sufficient to make a conclusion. In order to reduce the dependence on the alternate hypothesis, Kulldorff et al (Kulldorff et al, 2011) introduced the use of a maximized sequential probability ratio test (MaxSPRT), where the alternative hypothesis is composite rather than simple. This paper expands upon prior work (Loecher, 2017) and focuses on the effects of optional stopping on the multi-armed bandit (MAB) procedure outlined by (Scott, 2010; Scott, 2012). Our work touches upon previous research on American Options (Bank and Föllmer, 2003) that demonstrates a close connection between multi-armed bandits and optimal stopping times in terms of the Snell envelope of the given payoff process
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.