Multiobjective tree-based reinforcement learning for estimating tolerant dynamic treatment regimes.

Yao Song, Lu Wang

doi:10.1093/biomtc/ujad017

Abstract

A dynamic treatment regime (DTR) is a sequence of treatment decision rules that dictate individualized treatments based on evolving treatment and covariate history. It provides a vehicle for optimizing a clinical decision support system and fits well into the broader paradigm of personalized medicine. However, many real-world problems involve multiple competing priorities, and decision rules differ when trade-offs are present. Correspondingly, there may be more than one feasible decision that leads to empirically sufficient optimization. In this paper, we propose the concept of a "tolerant regime," which provides a set of individualized feasible decision rules under a prespecified tolerance rate. A multiobjective tree-based reinforcement learning (MOT-RL) method is developed to directly estimate the tolerant DTR (tDTR) that optimizes multiple objectives in a multistage multitreatment setting. At each stage, MOT-RL constructs an unsupervised decision tree by modeling the counterfactual mean outcome of each objective via semiparametric regression and maximizing a purity measure constructed from the scalarized augmented inverse probability weighted estimators (SAIPWE). The algorithm is implemented in a backward inductive manner through multiple decision stages, and it estimates the optimal DTR and tDTR depending on the decision-maker's preferences. MOT-RL is robust, efficient, easy to interpret, and flexible to different settings. We apply MOT-RL to evaluate 2-stage chemotherapy regimes that reduce disease burden and prolong survival for advanced prostate cancer patients using a dataset collected at MD Anderson Cancer Center.
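
The single-stage ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It uses the standard augmented inverse probability weighted (AIPW) estimator of a counterfactual mean, and it makes two assumptions that the abstract does not spell out: scalarization is taken to be a weighted sum of the objectives, and the tolerant set is taken to be every treatment whose scalarized value lies within a relative tolerance of the best value. The function names (`aipw_means`, `scalarized_values`, `tolerant_set`) and all arguments are hypothetical, and the propensities and outcome-model predictions are assumed to have been fitted elsewhere.

```python
import numpy as np


def aipw_means(y, a, propensity, outcome_model):
    """Standard AIPW estimate of the counterfactual mean of y under each treatment.

    y             : (n,) observed outcomes for one objective
    a             : (n,) observed treatments coded 0..K-1
    propensity    : (n, K) estimated P(A = k | X), assumed fitted elsewhere
    outcome_model : (n, K) estimated E[Y | X, A = k], assumed fitted elsewhere
    """
    y = np.asarray(y, dtype=float)
    a = np.asarray(a)
    propensity = np.asarray(propensity, dtype=float)
    outcome_model = np.asarray(outcome_model, dtype=float)
    n_treatments = propensity.shape[1]
    means = np.empty(n_treatments)
    for k in range(n_treatments):
        # Inverse probability weight: nonzero only for subjects who received k.
        w = (a == k).astype(float) / propensity[:, k]
        # Weighted outcome plus the augmentation term from the outcome model.
        means[k] = np.mean(w * y + (1.0 - w) * outcome_model[:, k])
    return means


def scalarized_values(objective_means, weights):
    """Combine per-objective means into one value per treatment.

    ASSUMPTION: scalarization is a weighted sum; the paper may use another form.
    objective_means : (J, K) one row of AIPW means per objective
    weights         : (J,) decision-maker preference weights
    """
    return np.asarray(weights, dtype=float) @ np.asarray(objective_means, dtype=float)


def tolerant_set(values, tolerance):
    """Treatments whose value is within a relative tolerance of the best one.

    ASSUMPTION: 'tolerance rate' is read as a relative shortfall from the maximum.
    """
    values = np.asarray(values, dtype=float)
    best = values.max()
    threshold = best - tolerance * abs(best)
    return [k for k, v in enumerate(values) if v >= threshold]
```

With a tolerance of 0, `tolerant_set` returns only the treatments tied for the best scalarized value, which corresponds to the usual optimal-regime choice; larger tolerances admit more near-optimal treatments. The tree construction and the backward induction across stages described in the abstract are not covered by this sketch.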

Journal: Biometrics
Publication Date: Jan 29, 2024
Citations: 1

Similar Papers

Stochastic Tree Search for Estimating Optimal Dynamic Treatment Regimes
Yilun Sun ... Lu Wang
Journal of the American Statistical Association | VOL. 116
23 Oct 2020

Tree-based reinforcement learning for estimating optimal dynamic treatment regimes.
Yebin Tao ... Daniel Almirall
The Annals of Applied Statistics | VOL. 12
01 Sep 2018

Treatment-competing events in dynamic regimes
Brent A Johnson
Lifetime Data Analysis | VOL. 14
09 Sep 2007

Dynamic treatment regimes: technical challenges and applications.
Eric B Laber ... Min Qian
Electronic Journal of Statistics | VOL. 8
01 Jan 2014
