Q-Learning: Flexible Learning About Useful Utilities

Erica E M Moodie, Yue Ru Sun, Nema Dean

doi:10.1007/s12561-013-9103-z

Open Access

https://doi.org/10.1007/s12561-013-9103-z

Abstract

Dynamic treatment regimes are fast becoming an important part of medicine, reflecting a shift in emphasis from treating the disease to treating the individual patient. Because few trials evaluate personally tailored treatment sequences, inferring optimal treatment regimes from observational data has become increasingly important. Q-learning is a popular method for estimating the optimal treatment regime, originally in randomized trials but more recently also in observational data. Previous applications of Q-learning have largely been restricted to continuous utility end-points with linear relationships. This paper is the first attempt both to extend the framework to discrete utilities and to move the modelling of covariates from linear models to more flexible ones using the generalized additive model (GAM) framework. Results on simulated data show that GAM-adapted Q-learning typically outperforms Q-learning with linear models, as well as other frequently used methods based on propensity scores, in terms of coverage and bias/MSE. This represents a promising step toward a more fully general Q-learning approach to estimating optimal dynamic treatment regimes.

Journal: Statistics in Biosciences
Publication Date: Sep 12, 2013
Citations: 105

Similar Papers

Response to Reader Reaction
Baqun Zhang ... Marie Davidian
Biometrics | VOL. 71
29 Oct 2014

Dynamic Regime Marginal Structural Mean Models for Estimation of Optimal Dynamic Treatment Regimes, Part I: Main Content
Liliana Orellana ... James M Robins
The International Journal of Biostatistics | VOL. 6
03 Jan 2010

Penalized Spline-Involved Tree-based (PenSIT) Learning for estimating an optimal dynamic treatment regime using observational data.
Kelly A Speth ... Michael R Elliott
Statistical Methods in Medical Research | VOL. 31
03 Oct 2022

Accountable survival contrast-learning for optimal dynamic treatment regimes
Taehwa Choi ... Hyunjun Lee
Scientific Reports | VOL. 13
08 Feb 2023
