Abstract

Dynamic treatment regimes are of growing interest across the clinical sciences because these regimes provide one way to operationalize, and thus inform, sequential personalized clinical decision making. Formally, a dynamic treatment regime is a sequence of decision rules, one per stage of clinical intervention. Each decision rule maps up-to-date patient information to a recommended treatment. We briefly review a variety of approaches for using data to construct the decision rules. We then review a critical inferential challenge that results from nonregularity, which often arises in this area. In particular, nonregularity arises in inference for parameters in the optimal dynamic treatment regime; the asymptotic limiting distributions of estimators are sensitive to local perturbations. We propose and evaluate a locally consistent Adaptive Confidence Interval (ACI) for the parameters of the optimal dynamic treatment regime. We use data from the Adaptive Pharmacological and Behavioral Treatments for Children with ADHD Trial as an illustrative example. We conclude by highlighting and discussing emerging theoretical problems in this area.

Highlights

  • ... confidence intervals or to carry out hypothesis testing

  • We propose and evaluate a locally consistent Adaptive Confidence Interval (ACI) for the parameters of the optimal dynamic treatment regime

  • We observe a time-ordered trajectory (X1, A1, X2, A2, X3) where: X1 denotes baseline subject information; A1 denotes the initial treatment, coded to take values in {0, 1}; X2 denotes subject information collected during the course of the first treatment but prior to the second treatment; A2 denotes the second treatment, coded to take values in {0, 1}; X3 denotes subject information collected during the course of the second treatment


Summary

Review of methods for constructing dynamic treatment regimes

Throughout, we consider the setting in which there are two stages of binary treatment; this simple setting is sufficient to illustrate the salient theoretical challenges. This is the case in the ADHD example, in which Y is a reverse-coded rating of the child’s impairment at the end of the last month of the study. In this two-stage setting, a DTR is a pair of decision rules π = (π1, π2), where πt : dom(Ht) → dom(At), so that a patient presenting at time t with Ht = ht is assigned treatment πt(ht). Indirect estimation methods use approximate dynamic programming with parametric, semiparametric, or nonparametric methods to first estimate models for the conditional mean or other aspects of the conditional distributions of the outcomes Y1, Y2, Y (for example, models for the Q-function) and then infer the optimal DTR from these models. We use Q-learning to illustrate how this nonsmoothness impacts the sampling distributions of DTR estimators.
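As a rough illustration of the indirect approach described above, the following Python sketch runs two-stage Q-learning with linear working models fitted by least squares. It is a minimal sketch under stated assumptions, not the procedure used in the paper: the function names (`design`, `fit_linear`, `q_learning`, `recommend`), the scalar history summaries (X1 at stage one, X2 alone at stage two), the feature set, and the simulated data are all hypothetical choices made for illustration. The maximization over the second-stage treatment is the nonsmooth step referred to in the text.

```python
import numpy as np


def design(h, a):
    """Feature matrix [1, h, a, a*h] for a scalar history summary h and
    a binary treatment a in {0, 1}. The feature choice is an assumption."""
    return np.column_stack([np.ones_like(h), h, a, a * h])


def fit_linear(features, target):
    """Ordinary least squares coefficients."""
    coef, *_ = np.linalg.lstsq(features, target, rcond=None)
    return coef


def q_learning(x1, a1, x2, a2, y):
    """Two-stage Q-learning with linear working models.

    Simplifying assumption: the stage-2 history is summarized by x2 only
    and the stage-1 history by x1 only.
    """
    # Stage 2: regress the final outcome on stage-2 history and treatment.
    beta2 = fit_linear(design(x2, a2), y)

    # Pseudo-outcome: predicted outcome under the better stage-2 treatment.
    # The max over treatments is the nonsmooth operation.
    q2_if_0 = design(x2, np.zeros_like(x2)) @ beta2
    q2_if_1 = design(x2, np.ones_like(x2)) @ beta2
    pseudo_outcome = np.maximum(q2_if_0, q2_if_1)

    # Stage 1: regress the pseudo-outcome on stage-1 history and treatment.
    beta1 = fit_linear(design(x1, a1), pseudo_outcome)
    return beta1, beta2


def recommend(beta, h):
    """Recommend treatment 1 where the estimated treatment contrast
    beta[2] + beta[3] * h is positive, and treatment 0 otherwise."""
    return (beta[2] + beta[3] * h > 0).astype(int)


# Simulated data (hypothetical generative model, for illustration only).
rng = np.random.default_rng(0)
n = 2000
x1 = rng.normal(size=n)
a1 = rng.integers(0, 2, size=n).astype(float)
x2 = 0.5 * x1 + rng.normal(size=n)
a2 = rng.integers(0, 2, size=n).astype(float)
y = x2 + a2 * (1.0 - 2.0 * x2) + rng.normal(size=n)

beta1, beta2 = q_learning(x1, a1, x2, a2, y)
print("stage-2 coefficients:", np.round(beta2, 2))
print("stage-1 coefficients:", np.round(beta1, 2))
```

In this sketch the estimated regime is read off from the sign of the fitted treatment contrast at each stage. When that contrast is at or near zero for a nonnegligible set of patients, the max in the pseudo-outcome is evaluated near its kink, which is the source of the nonregularity in the first-stage coefficients.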

Q-learning
Asymptotic bias
Confidence intervals
A projection interval
Adaptive confidence intervals
Theoretical results
Experiments
Analysis of the ADHD study
Conclusion
Proof of Theorems in Section 3
Proof of Theorems in Section 4
Findings
Results for second stage parameters
