Abstract
Dynamic treatment regimes are of growing interest across the clinical sciences because these regimes provide one way to operationalize, and thus inform, sequential personalized clinical decision making. Formally, a dynamic treatment regime is a sequence of decision rules, one per stage of clinical intervention. Each decision rule maps up-to-date patient information to a recommended treatment. We briefly review a variety of approaches for using data to construct the decision rules. We then review a critical inferential challenge that results from nonregularity, which often arises in this area. In particular, nonregularity arises in inference for parameters in the optimal dynamic treatment regime; the asymptotic (limiting) distributions of the estimators are sensitive to local perturbations. We propose and evaluate a locally consistent Adaptive Confidence Interval (ACI) for the parameters of the optimal dynamic treatment regime. We use data from the Adaptive Pharmacological and Behavioral Treatments for Children with ADHD Trial as an illustrative example. We conclude by highlighting and discussing emerging theoretical problems in this area.
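For intuition about why nonregularity complicates inference (an illustration added here, not part of the abstract itself): parameters of an optimal regime typically involve a maximum over treatments, and the map (mu_0, mu_1) -> max(mu_0, mu_1) is not differentiable where mu_0 = mu_1. Assuming only that the component estimators are jointly asymptotically normal, the plug-in estimator of the maximum satisfies

$$
\sqrt{n}\,\bigl\{\max(\hat\mu_0,\hat\mu_1)-\max(\mu_0,\mu_1)\bigr\}
\;\rightsquigarrow\;
\begin{cases}
Z_1, & \mu_1 > \mu_0,\\
Z_0, & \mu_0 > \mu_1,\\
\max(Z_0, Z_1), & \mu_0 = \mu_1,
\end{cases}
$$

where $(Z_0, Z_1)$ is the joint normal limit of $\sqrt{n}(\hat\mu_0-\mu_0,\ \hat\mu_1-\mu_1)$. The limit changes abruptly at $\mu_0=\mu_1$, and local (order $n^{-1/2}$) perturbations of $\mu_0-\mu_1$ yield still other limits; this is the sensitivity to local perturbations referred to above.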
Highlights
Nonregularity makes it difficult to construct valid confidence intervals or to carry out hypothesis tests for parameters of the optimal dynamic treatment regime
We propose and evaluate a locally consistent Adaptive Confidence Interval (ACI) for the parameters of the optimal dynamic treatment regime
We observe a time-ordered trajectory (X1, A1, X2, A2, X3) where: X1 denotes baseline subject information; A1 denotes the initial treatment, coded to take values in {0, 1}; X2 denotes subject information collected during the course of the first treatment but prior to the second treatment; A2 denotes the second treatment, coded to take values in {0, 1}; X3 denotes subject information collected during the course of the second treatment
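A minimal simulation sketch of this trajectory structure may make the notation concrete; the generative model, sample size, and variable names below are illustrative assumptions of ours, not part of the paper.

```python
# Sketch of a two-stage trajectory (X1, A1, X2, A2, X3) with binary treatments.
# The data-generating model here is purely illustrative.
import numpy as np

rng = np.random.default_rng(0)
n = 500

X1 = rng.normal(size=n)                        # baseline subject information
A1 = rng.integers(0, 2, size=n)                # initial treatment, in {0, 1}
X2 = 0.5 * X1 + 0.3 * A1 + rng.normal(size=n)  # info during first treatment
A2 = rng.integers(0, 2, size=n)                # second treatment, in {0, 1}
X3 = 0.4 * X2 + 0.2 * A2 + rng.normal(size=n)  # info during second treatment

# Histories available to the stage-specific decision rules:
# H1 = X1 and H2 = (X1, A1, X2).
H1 = X1.reshape(-1, 1)
H2 = np.column_stack([X1, A1, X2])
```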
Summary
Throughout, we consider the setting in which there are two stages of binary treatment; this simple setting is sufficient to illustrate the salient theoretical challenges. This is the case in the ADHD example, in which Y is a reverse-coded rating of the child's impairment at the end of the last month of the study. In this two-stage setting, a DTR is a pair of decision rules π = (π1, π2), where πt : dom(Ht) → dom(At), so that a patient presenting at time t with Ht = ht is assigned treatment πt(ht). Indirect estimation methods use approximate dynamic programming with parametric, semiparametric, or nonparametric methods to first estimate models for the conditional means or other aspects of the conditional distributions of the outcomes Y1, Y2, Y (for example, models for the Q-functions) and from these models infer the optimal DTR. We use Q-learning to illustrate how this nonsmoothness (the maximization over treatments in the optimal regime) impacts the sampling distributions of DTR estimators.
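To make the Q-learning recipe and the source of the nonsmoothness concrete, here is a minimal two-stage sketch assuming linear working models for the Q-functions; the simulated data, feature choices, and coefficient layout are illustrative assumptions, not the paper's analysis of the ADHD trial.

```python
# Two-stage Q-learning with binary treatments and linear working models.
# Larger Y is better (e.g., a reverse-coded impairment rating).
import numpy as np
from numpy.linalg import lstsq

def fit_linear(features, y):
    """Ordinary least squares; returns the coefficient vector."""
    coef, *_ = lstsq(features, y, rcond=None)
    return coef

# Illustrative simulated data standing in for (H1, A1, H2, A2, Y).
rng = np.random.default_rng(1)
n = 500
H1 = rng.normal(size=(n, 1))
A1 = rng.integers(0, 2, size=n)
H2 = np.column_stack([H1[:, 0], A1, rng.normal(size=n)])
A2 = rng.integers(0, 2, size=n)
Y = H2[:, 2] + A2 * (0.5 - H2[:, 2]) + rng.normal(size=n)

# Stage 2: regress Y on (1, H2, A2, A2 * X2) to estimate Q2(h2, a2).
Phi2 = np.column_stack([np.ones(n), H2, A2, A2 * H2[:, 2]])
beta2 = fit_linear(Phi2, Y)

def Q2(h2, a2):
    return (beta2[0] + h2 @ beta2[1:4]
            + a2 * beta2[4] + a2 * h2[..., 2] * beta2[5])

# Pseudo-outcome: the maximum over a2 is the nonsmooth step --
# it is non-differentiable in beta2 wherever the two treatments tie.
Ytilde = np.maximum(Q2(H2, 0), Q2(H2, 1))

# Stage 1: regress the pseudo-outcome on (1, X1, A1, A1 * X1).
Phi1 = np.column_stack([np.ones(n), H1[:, 0], A1, A1 * H1[:, 0]])
beta1 = fit_linear(Phi1, Ytilde)

# Estimated optimal rules: choose the treatment with the larger Q-value.
pi2 = lambda h2: int(Q2(h2, 1) > Q2(h2, 0))
pi1 = lambda h1: int(beta1[2] + beta1[3] * h1 > 0)
```

Because the stage-1 regression targets a pseudo-outcome built from a maximum over the estimated stage-2 Q-function, the stage-1 coefficient estimates inherit the nonregular (perturbation-sensitive) limiting behavior discussed in the abstract.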