Abstract
Multi-armed bandit (MAB) problems, typically modeled as Markov decision processes (MDPs), exemplify the learning-versus-earning tradeoff. The study of clinical trials is one area that has motivated theoretical research on MAB designs, where applying such designs has the potential to significantly improve patient outcomes. However, for many practical problems of interest the state space is intractably large, rendering exact approaches to solving MDPs impractical. In particular, settings that require multiple simultaneous allocations lead to an expanded state and action-outcome space, necessitating approximation approaches. We propose a novel approximation approach that combines the strengths of multiple methods: grid-based state discretization, value function approximation, and techniques for a computationally efficient implementation. The hallmark of our approach is an accurate approximation of the value function that combines linear interpolation with bounds on the interpolated value, together with a learning component added to the objective function. Computational analysis on relevant datasets shows that our approach outperforms existing heuristics (e.g., the greedy and upper confidence bound families of algorithms) as well as a popular Lagrangian-based approximation method, improving average regret by up to 58.3%. A retrospective implementation on a recently conducted phase 3 clinical trial shows that our design could have reduced the number of failures by 17% relative to the randomized control design used in that trial. Our proposed approach makes it practically feasible for trial administrators and regulators to implement Bayesian response-adaptive designs in large clinical trials, with potentially significant gains.
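To make the value-function idea above concrete, the following is a minimal, hypothetical sketch (not the paper's implementation) of grid-based discretization with linear interpolation and clamping bounds, for a two-armed Bernoulli setting. The grid resolution, horizon, the stand-in value table, and the particular lower and upper bounds used for clamping are all illustrative assumptions.

```python
# Hypothetical sketch of value-function interpolation with clamping bounds,
# in the spirit of the approach described above (not the paper's actual code).
# We tabulate a value function V on a grid over the two arms' posterior success
# probabilities, then evaluate off-grid beliefs by bilinear interpolation,
# clamping the result between simple lower/upper bounds on the optimal value.
import numpy as np
from scipy.interpolate import RegularGridInterpolator

horizon = 50                              # remaining allocations (assumed)
grid = np.linspace(0.0, 1.0, 21)          # discretized posterior success rates
P1, P2 = np.meshgrid(grid, grid, indexing="ij")

# Stand-in value table: expected successes if we committed to the better arm.
# In the actual method this table would come from approximate dynamic
# programming over the discretized belief space.
V_table = horizon * np.maximum(P1, P2)

interp = RegularGridInterpolator((grid, grid), V_table)  # linear by default

def value(p1: float, p2: float) -> float:
    """Interpolated value at an off-grid belief, clamped to sanity bounds."""
    v = interp([[p1, p2]]).item()
    lower = horizon * max(p1, p2)         # always play the better arm
    upper = float(horizon)                # every allocation succeeds
    return min(max(v, lower), upper)

print(round(value(0.33, 0.61), 2))
```

The clamping step illustrates how bounds on the interpolated value can keep approximation error in check when the grid is coarse; the abstract's reference to a learning component in the objective is not shown here.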