Q‐learning for estimating optimal dynamic treatment rules from observational data

Erica E M Moodie, Michael S Kramer, Bibhas Chakraborty

doi:10.1002/cjs.11162

Abstract

The area of dynamic treatment regimes (DTR) aims to make inference about adaptive, multistage decision-making in clinical practice. A DTR is a set of decision rules, one per interval of treatment, where each decision is a function of treatment and covariate history that returns a recommended treatment. Q-learning is a popular method from the reinforcement learning literature that has recently been applied to estimate DTRs. While, in principle, Q-learning can be used for both randomized and observational data, the focus in the literature thus far has been exclusively on the randomized treatment setting. We extend the method to incorporate measured confounding covariates, using direct adjustment and a variety of propensity score approaches. The methods are examined under various settings including non-regular scenarios. We illustrate the methods in examining the effect of breastfeeding on vocabulary testing, based on data from the Promotion of Breastfeeding Intervention Trial.
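As a rough illustration of the direct-adjustment idea mentioned in the abstract, the following is a minimal sketch of two-stage Q-learning with linear Q-functions fitted by ordinary least squares. It is not the authors' implementation: the function names (`fit_ols`, `q_learning_two_stage`), the binary 0/1 treatment coding, the particular model terms, and the simulated data-generating process are all assumptions made for this example. Confounders enter each Q-function as main-effect terms, and treatment assignment in the simulation depends on them to mimic observational data.

```python
import numpy as np


def fit_ols(design, outcome):
    """Ordinary least squares coefficients for outcome ~ design."""
    coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return coef


def q_learning_two_stage(x1, a1, x2, a2, y):
    """Two-stage Q-learning with linear Q-functions (illustrative only).

    x1, x2: covariates measured before stage 1 and stage 2 treatment.
    a1, a2: binary (0/1) treatments.  y: final outcome, larger is better.
    Returns the estimated treatment-interaction ("blip") coefficients
    (psi1, psi2) that define the estimated rule at each stage.
    """
    ones = np.ones(len(y))

    # Stage 2: regress y on history (direct adjustment for x1, x2)
    # plus a2 and its interaction with x2.
    main2 = np.column_stack([ones, x1, x2, a1, a1 * x1])
    blip2 = np.column_stack([ones, x2])
    coef2 = fit_ols(np.column_stack([main2, a2[:, None] * blip2]), y)
    beta2, psi2 = coef2[: main2.shape[1]], coef2[main2.shape[1]:]

    # Pseudo-outcome: predicted outcome under the best stage-2 choice.
    pseudo = main2 @ beta2 + np.maximum(blip2 @ psi2, 0.0)

    # Stage 1: regress the pseudo-outcome on x1, a1 and a1 * x1.
    main1 = np.column_stack([ones, x1])
    blip1 = np.column_stack([ones, x1])
    coef1 = fit_ols(np.column_stack([main1, a1[:, None] * blip1]), pseudo)
    psi1 = coef1[main1.shape[1]:]
    return psi1, psi2


# Simulated observational data (assumed for illustration): treatment
# probabilities depend on covariates that also affect the outcome.
rng = np.random.default_rng(0)
n = 5000
x1 = rng.normal(size=n)
a1 = rng.binomial(1, 1 / (1 + np.exp(-0.5 * x1)))
x2 = rng.normal(size=n)
a2 = rng.binomial(1, 1 / (1 + np.exp(-0.5 * x2)))
y = x1 + x2 + a1 * (0.5 + x1) + a2 * (1.0 - 2.0 * x2) + rng.normal(size=n)

psi1_hat, psi2_hat = q_learning_two_stage(x1, a1, x2, a2, y)
print("stage 1 blip coefficients:", psi1_hat)
print("stage 2 blip coefficients:", psi2_hat)
```

Under this sketch, the estimated stage-2 rule recommends treatment when `psi2_hat[0] + psi2_hat[1] * x2 > 0`, and the stage-1 rule when `psi1_hat[0] + psi1_hat[1] * x1 > 0`. In the simulation the true coefficients are (0.5, 1.0) at stage 1 and (1.0, -2.0) at stage 2, so the estimates can be compared against them. A propensity-score variant would replace or supplement the covariate main effects with an estimated treatment probability; that is not shown here.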

Journal: Canadian Journal of Statistics
Publication Date: Nov 7, 2012
Citations: 73

Similar Papers

Estimation of optimal dynamic treatment regimes.
Ying-Qi Zhao ... Eric B Laber
Clinical trials (London, England) | VOL. 11
28 May 2014

Commentary
Michael S Kramer ... Erica E M Moodie
Epidemiology | VOL. 23
01 Nov 2012

Penalized Spline-Involved Tree-based (PenSIT) Learning for estimating an optimal dynamic treatment regime using observational data.
Kelly A Speth ... Michael R Elliott
Statistical Methods in Medical Research | VOL. 31
03 Oct 2022

Estimating and improving dynamic treatment regimes with a time-varying instrumental variable
Shuxiao Chen ... Bo Zhang
Journal of the Royal Statistical Society Series B: Statistical Methodology | VOL. 85
27 Mar 2023
