Abstract

We consider sequential decision making under uncertainty: optimization over a large decision space with noisy comparative feedback. This problem can be formulated as a K-armed dueling bandits problem, where K is the total number of decisions. When K is very large, existing dueling bandits algorithms suffer large cumulative regret before converging on the optimal arm. This paper studies the dueling bandits problem with a large number of dependent arms. Our problem is motivated by a clinical decision-making process over a large decision space. We propose an efficient algorithm, CorrDuel, that makes decisions to simultaneously deliver effective therapy and explore the decision space. Many sequential decision-making problems with large, structured decision spaces could benefit from our algorithm. After evaluating the fast convergence of CorrDuel in analysis and simulation experiments, we applied it in a live clinical trial of therapeutic spinal cord stimulation. It is the first algorithm applied to spinal cord injury treatment, and the experimental results show its effectiveness and efficiency.
