Choosing an analytic approach: key study design considerations in state policy evaluation
Abstract This paper reviews and details methods for state policy evaluation to guide selection of a research approach, based on an evaluation’s setting and available data. We highlight key design considerations for an analysis, including treatment and control group selection, timing of policy adoption, expected effect heterogeneity, and data considerations. We then provide an overview of analytic approaches and differentiate between methods based on an evaluation’s context, such as settings with no control units, a single treated unit, multiple treated units, or with multiple treatment cohorts. Methods discussed include: interrupted time series models, difference-in-differences estimators, autoregressive models, and synthetic control methods, along with method extensions which address issues like staggered policy adoption and heterogenous treatment effects. We end with an illustrative example, applying the developed framework to evaluate the impacts of state-level naloxone standing order policies on overdose rates. Overall, we provide researchers with an approach for deciding on methods for state policy evaluations, which can be used to select study designs and inform methodological choices.
- Research Article
1
- 10.1002/pst.2077
- Oct 28, 2020
- Pharmaceutical Statistics
Clinical trials are primarily conducted to understand the average effects treatments have on patients. However, patients are heterogeneous in the severity of the condition and in ways that affect what treatment effect they can expect. It is therefore important to understand and characterize how treatment effects vary. The design and analysis of clinical studies play critical roles in evaluating and characterizing heterogeneous treatment effects. This panel discussed considerations in design and analysis under the recognition that there are heterogeneous treatment effects across subgroups of patients. Panel members discussed many questions including: What is a good estimate of the treatment effect in me, a 65-year-old, bald, Caucasian-American, male patient? What magnitude of heterogeneity of treatment effects (HTE) is sufficiently large to merit attention? What role can prior evidence about HTE play in confirmatory trial design and analysis? Is there anything described in the 21st Century Cures Act that would benefit from greater attention to HTE? An example of a Bayesian approach addressing multiplicity when testing for treatment effects in subgroups will be provided. We can do more or better at understanding heterogeneous treatment effects and providing the best information on heterogeneous treatment effects.
- Abstract
4
- 10.1016/s0140-6736(17)32961-6
- Nov 1, 2017
- The Lancet
Use of synthetic control methodology for evaluating public health interventions: a literature review
- Research Article
- 10.1177/09622802251316969
- Feb 24, 2025
- Statistical Methods in Medical Research
There has been a renewed interest in identifying heterogenous treatment effects (HTEs) to guide personalized medicine. The objective was to illustrate the use of a step-by-step transparent parametric data-adaptive approach (the generalized HTE approach) based on the G-computation algorithm to detect heterogenous subgroups and estimate meaningful conditional average treatment effects (CATE). The following seven steps implement the generalized HTE approach: Step 1: Select variables that satisfy the backdoor criterion and potential effect modifiers; Step 2: Specify a flexible saturated model including potential confounders and effect modifiers; Step 3: Apply a selection method to reduce overfitting; Step 4: Predict potential outcomes under treatment and no treatment; Step 5: Contrast the potential outcomes for each individual; Step 6: Fit cluster modeling to identify potential effect modifiers; Step 7: Estimate subgroup CATEs. We illustrated the use of this approach using simulated and real data. Our generalized HTE approach successfully identified HTEs and subgroups defined by all effect modifiers using simulated and real data. Our study illustrates that it is feasible to use a step-by-step parametric and transparent data-adaptive approach to detect effect modifiers and identify meaningful HTEs in an observational setting. This approach should be more appealing to epidemiologists interested in explanation.
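The prediction and contrast steps listed in this abstract (Steps 4, 5 and 7) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes one binary effect modifier `x`, a binary treatment `a`, and a saturated outcome model that reduces to cell means. The variable selection, overfitting control and clustering steps (1 to 3 and 6) are omitted. The records and all function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy records: (effect modifier x, treatment a, outcome y).
records = [
    (0, 0, 1.0), (0, 0, 3.0), (0, 1, 2.0), (0, 1, 4.0),
    (1, 0, 2.0), (1, 0, 2.0), (1, 1, 6.0), (1, 1, 4.0),
]

def fit_saturated_model(rows):
    """With binary x and a, a saturated outcome model is the mean of y per (x, a) cell."""
    cells = defaultdict(list)
    for x, a, y in rows:
        cells[(x, a)].append(y)
    return {cell: mean(ys) for cell, ys in cells.items()}

def subgroup_cates(rows):
    """G-computation: predict both potential outcomes for every individual,
    contrast them, then average the contrasts within each level of x.
    Assumes every (x, a) cell is non-empty (positivity)."""
    model = fit_saturated_model(rows)
    contrasts = defaultdict(list)
    for x, _a, _y in rows:
        y1 = model[(x, 1)]  # predicted outcome if treated
        y0 = model[(x, 0)]  # predicted outcome if untreated
        contrasts[x].append(y1 - y0)
    return {x: mean(diffs) for x, diffs in contrasts.items()}

cates = subgroup_cates(records)
```

In this toy setting the subgroup estimates are just differences of cell means. With continuous covariates the cell-mean model would be replaced by a fitted regression, and the contrasts would vary across individuals within a subgroup.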
- Research Article
2
- 10.1177/09622802231224638
- Feb 6, 2024
- Statistical Methods in Medical Research
Estimating treatment (or policy or intervention) effects on a single individual or unit has become increasingly important in health and biomedical sciences. One method to estimate these effects is the synthetic control method, which constructs a synthetic control, a weighted average of control units that best matches the treated unit's pre-treatment outcomes and other relevant covariates. The intervention's impact is then estimated by comparing the post-intervention outcomes of the treated unit and its synthetic control, which serves as a proxy for the counterfactual outcome had the treated unit not experienced the intervention. The augmented synthetic control method, a recent adaptation of the synthetic control method, relaxes some of the synthetic control method's assumptions for broader applicability. While synthetic controls have been used in a variety of fields, their use in public health and biomedical research is more recent, and newer methods such as the augmented synthetic control method are underutilized. This paper briefly describes the synthetic control method and its application, explains the augmented synthetic control method and its differences from the synthetic control method, and estimates the effects of an antimalarial initiative in Mozambique using both the synthetic control method and the augmented synthetic control method to highlight the advantages of using the augmented synthetic control method to analyze the impact of interventions implemented in a single region.
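The weighting idea described in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the method as implemented in any particular package: it matches on pre-treatment outcomes only (no covariates), and it finds non-negative weights summing to one by projected gradient descent, which is a solver chosen here for brevity. All data values and function names are hypothetical.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w_j >= 0, sum_j w_j = 1}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for j, u_j in enumerate(u, start=1):
        cumulative += u_j
        candidate = (cumulative - 1.0) / j
        if u_j - candidate > 0:
            theta = candidate  # keep the threshold from the largest valid j
    return [max(x - theta, 0.0) for x in v]

def synthetic_control_weights(treated_pre, donors_pre, n_iter=5000):
    """Minimise the squared pre-treatment gap between the treated unit and a
    weighted average of donor units, with weights constrained to the simplex."""
    n_donors, n_periods = len(donors_pre), len(treated_pre)
    # Step size 1/L, where L upper-bounds the gradient's Lipschitz constant.
    step = 1.0 / (2.0 * sum(x * x for donor in donors_pre for x in donor))
    w = [1.0 / n_donors] * n_donors
    for _ in range(n_iter):
        resid = [
            treated_pre[t] - sum(w[j] * donors_pre[j][t] for j in range(n_donors))
            for t in range(n_periods)
        ]
        grad = [
            -2.0 * sum(resid[t] * donors_pre[j][t] for t in range(n_periods))
            for j in range(n_donors)
        ]
        w = project_to_simplex([w[j] - step * grad[j] for j in range(n_donors)])
    return w

# Hypothetical data: the treated unit's pre-period equals 0.25*A + 0.75*B.
donors_pre = [[1, 2, 3, 4], [2, 1, 4, 3], [5, 5, 1, 0]]
treated_pre = [1.75, 1.25, 3.75, 3.25]
donors_post = [[5, 6], [4, 5], [0, 0]]
treated_post = [3.25, 4.25]

weights = synthetic_control_weights(treated_pre, donors_pre)
synthetic_post = [
    sum(w * donor[t] for w, donor in zip(weights, donors_post)) for t in range(2)
]
# Estimated effect per post period: treated outcome minus synthetic control.
gaps = [y - s for y, s in zip(treated_post, synthetic_post)]
```

The post-period gaps are the effect estimates. The augmented variant mentioned in the abstract additionally corrects for remaining pre-treatment imbalance, which this sketch does not attempt.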
- Conference Article
- 10.1136/jech-2018-ssmabstracts.139
- Sep 1, 2018
Background The synthetic control method (SCM) improves causal inference in non-randomised studies by building a counterfactual using a weighted combination of potential control units. Although it has been widely used in other disciplines it is not widely used in public health research. Our objectives were to identify the use of SCM studies in health and to summarise strengths and limitations identified in the literature. Methods We included studies that used a SCM design to investigate a health outcome of any intervention in any population. We searched for the term ‘synthetic control method’ in 26 health, social science and grey literature databases as well as checking for additional studies by key authors. No restrictions were placed on language or date. Searches were completed in February 2016. We summarised key information about the studies including setting, number of treated and control units, intervention and outcome, number of pre- and post-intervention data-points available and other methods used in the same study. Results Searches identified 35 health-related studies of which 23 were from US settings and investigated a single treated unit. Most studies had at least 10 control units. Interventions investigated included health finance and health systems reform, industry regulation and taxation policies. Common outcomes were mortality rates and insurance rates/health care access. Most studies had more than 4 pre- and post-implementation data-points. SCM is most commonly used alongside difference-in-difference methods. Advantages of SCM are that it does not rely on parallel pre-implementation trends and that it allows for time-varying unmeasured confounders. 
Limitations include the need for suitable data on both the treated unit and a pool of potential controls, difficulties if the treated unit is an outlier and the inapplicability of traditional statistical tests due to the small number of treated and control units and the fact that they have not been randomly allocated. Falsification tests are generally used as an alternative. Conclusion This comprehensive literature review suggests that SCM has been little used in health despite some advantages over existing methods. Future research incorporating the method, ideally in combination with other methods, would be of value.
- Research Article
146
- 10.1136/jech-2017-210106
- Jul 12, 2018
- Journal of Epidemiology and Community Health
Background Many public health interventions cannot be evaluated using randomised controlled trials so they rely on the assessment of observational data. Techniques for evaluating public health interventions using observational data include...
- Research Article
105
- 10.1017/pan.2017.15
- Sep 4, 2017
- Political Analysis
Randomized experiments are increasingly used to study political phenomena because they can credibly estimate the average effect of a treatment on a population of interest. But political scientists are often interested in how effects vary across subpopulations (heterogeneous treatment effects) and how differences in the content of the treatment affect responses (the response to heterogeneous treatments). Several new methods have been introduced to estimate heterogeneous effects, but it is difficult to know if a method will perform well for a particular data set. Rather than using only one method, we show how an ensemble of methods (weighted averages of estimates from individual models, increasingly used in machine learning) accurately measures heterogeneous effects. Building on a large literature on ensemble methods, we show how the weighting of methods can contribute to accurate estimation of heterogeneous treatment effects and demonstrate how pooling models leads to superior performance over individual methods across diverse problems. We apply the ensemble method to two experiments, illuminating how the ensemble method for heterogeneous treatment effects facilitates exploratory analysis of treatment effects.
- Research Article
28
- 10.1016/j.annepidem.2022.04.009
- Apr 26, 2022
- Annals of Epidemiology
Heterogeneous treatment effects in social policy studies: An assessment of contemporary articles in the health and social sciences
- Research Article
15
- 10.1016/j.puhe.2020.04.007
- Jun 3, 2020
- Public Health
Exploring the effect of Colorado's recreational marijuana policy on opioid overdose rates
- Research Article
- 10.1093/jamiaopen/ooaf146
- Nov 9, 2025
- JAMIA Open
Objectives To identify causal mechanisms driving variations in the impact of rehabilitation treatments on stroke survivors’ independence improvement during rehabilitation inpatient stays. Materials and Methods Iterative cycles of clinical input and causal machine learning (causal ML) were employed toward the goal of identifying relevant heterogeneous treatment effects. Data were from stroke patients (n = 484) seeking to improve independence during inpatient rehabilitation, where treatments provided included sessions (eg, physical therapy) and medication administration. Results We find heterogeneity in rehabilitation treatment effects for a number of patient subgroups. Patient subgroups found to have the most heterogeneity in treatment effects were those with a bilateral involvement stroke location and those with diabetes. In a small minority of cases, we also observe heterogeneous treatment effects for those of older age, males versus females, and stroke location on either the right or left side of the brain. In regard to therapies, those related to mental health (ie, psychotherapy and spiritual/chaplaincy) had the most positive uplift in independence outcomes by the end of inpatient rehabilitation stays. Discussion Stroke survivors have varying responses to stroke rehabilitation treatments. We show that heterogeneous treatment effects are indeed present in rehabilitation. Identification of specific mechanisms, such as stroke location and provisioning of mental health services, is made possible through the use of causal ML applied to observational data in stroke rehabilitation. Conclusions Causal ML can help to identify the mechanisms driving independence outcome variation. However, the large number of effects discovered and the small size of many effects make clinician feedback of paramount importance. Use of causal ML with clinician feedback throughout the process improves identification of appropriate measures and selection of relevant results.
- Research Article
1028
- 10.1257/jep.31.2.3
- May 1, 2017
- Journal of Economic Perspectives
In this paper, we discuss recent developments in econometrics that we view as important for empirical researchers working on policy evaluation questions. We focus on three main areas, in each case, highlighting recommendations for applied work. First, we discuss new research on identification strategies in program evaluation, with particular focus on synthetic control methods, regression discontinuity, external validity, and the causal interpretation of regression methods. Second, we discuss various forms of supplementary analyses, including placebo analyses as well as sensitivity and robustness analyses, intended to make the identification strategies more credible. Third, we discuss some implications of recent advances in machine learning methods for causal effects, including methods to adjust for differences between treated and control units in high-dimensional settings, and methods for identifying and estimating heterogenous treatment effects.
- Research Article
- 10.1080/14737167.2025.2482661
- Mar 28, 2025
- Expert Review of Pharmacoeconomics & Outcomes Research
Background With 10.95 million cases (11 March 2020-9 February 2022), Italy was massively hit by the coronavirus disease 2019 (COVID-19) pandemic. Most of the COVID-19-related inpatient discharges were codified under the Diagnosis-Related Group (DRG) 79. During 2019–2021, DRG 79 inpatient discharges increased from 20,377 to 130,580 (+540.82%). Research design and methods To investigate the causal relationship between DRG 79 inpatient discharges and COVID-19, the synthetic control method (SCM) compared the real with the counterfactual DRG 79. The latter was a weighted combination of control units (22 DRGs unrelated to COVID-19). The SCM mimicked the trajectory of DRG 79 in the absence of COVID-19. Placebo studies and robustness test investigated the reliability of the baseline findings. Results Six out of the 22 control units contribute to the counterfactual DRG 79. The real and the counterfactual DRG 79 cease to overlap from 2019 onward. Placebo studies and robustness test confirm the causal relationship of COVID-19 with the increased number of inpatient discharges coded under DRG 79 during 2019–2021. Conclusion The SCM identifies a causal link between COVID-19 and DRG 79 in Italy. Hopefully, future contributions will utilize SCM (and causal inference in general) in health care decision-making within the Italian National Health Service.
- Research Article
4
- 10.1111/aas.14167
- Nov 8, 2022
- Acta Anaesthesiologica Scandinavica
Corticosteroids improve outcomes in patients with severe COVID-19. In the COVID STEROID 2 randomised clinical trial, we found high probabilities of benefit with dexamethasone 12 versus 6 mg daily. While no statistically significant heterogeneity in treatment effects (HTE) was found in the conventional, dichotomous subgroup analyses, these analyses have limitations, and HTE could still exist. We assessed whether HTE was present for days alive without life support and mortality at Day 90 in the trial according to baseline age, weight, number of comorbidities, category of respiratory failure (type of respiratory support system and oxygen requirements) and predicted risk of mortality using an internal prediction model. We used flexible models for continuous variables and logistic regressions for categorical variables without dichotomisation of the baseline variables of interest. HTE was assessed both visually and with p and S values from likelihood ratio tests. There was no strong evidence for substantial HTE on either outcome according to any of the baseline variables assessed with all p values >.37 (and all S values <1.43) in the planned analyses and no convincingly strong visual indications of HTE. We found no strong evidence for HTE with 12 versus 6 mg dexamethasone daily on days alive without life support or mortality at Day 90 in patients with COVID-19 and severe hypoxaemia, although these results cannot rule out HTE either.
- Research Article
14
- 10.1515/jci-2020-0013
- Dec 19, 2020
- Journal of Causal Inference
In some applications, researchers using the synthetic control method (SCM) to evaluate the effect of a policy may struggle to determine whether they have identified a “good match” between the control group and treated group. In this paper, we demonstrate the utility of the mean and maximum Absolute Standardized Mean Difference (ASMD) as a test of balance between a synthetic control unit and treated unit, and provide guidance on what constitutes a poor fit when using a synthetic control. We explore and compare other potential metrics using a simulation study. We provide an application of our proposed balance metric to the 2013 Los Angeles (LA) Firearm Study [9]. Using Uniform Crime Report data, we apply the SCM to obtain a counterfactual for the LA firearm-related crime rate based on a weighted combination of control units in a donor pool of cities. We use this counterfactual to estimate the effect of the LA Firearm Study intervention and explore the impact of changing the donor pool and pre-intervention duration period on resulting matches and estimated effects. We demonstrate how decision-making about the quality of a synthetic control can be improved by using ASMD. The mean and max ASMD clearly differentiate between poor matches and good matches. Researchers need better guidance on what is a meaningful imbalance between synthetic control and treated groups. In addition to the use of gap plots, the proposed balance metric can provide an objective way of determining fit.
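A balance check of the kind this abstract describes can be sketched as below. The abstract does not give the exact formula, so the standardisation used here (the absolute treated-minus-synthetic difference divided by the sample standard deviation across donor units, per matching variable) is an assumption. All numbers and function names are hypothetical.

```python
from statistics import mean, stdev

def asmd_per_variable(treated, synthetic, donor_values):
    """Absolute standardized mean difference for each matching variable.

    treated[k] and synthetic[k] are the treated unit's and the synthetic
    control's values of variable k; donor_values[k] lists that variable across
    the donor pool. ASSUMPTION: the denominator is the donor-pool sample SD,
    which must be non-zero and needs at least two donors.
    """
    return [
        abs(t - s) / stdev(donors)
        for t, s, donors in zip(treated, synthetic, donor_values)
    ]

# Hypothetical values for two matching variables and three donor units.
treated = [10.0, 20.0]
synthetic = [11.0, 18.0]
donor_values = [[8.0, 10.0, 12.0], [15.0, 20.0, 25.0]]

asmd = asmd_per_variable(treated, synthetic, donor_values)
mean_asmd = mean(asmd)  # overall balance
max_asmd = max(asmd)    # worst-matched variable
```

Reporting both summaries serves the purpose the abstract describes: the mean reflects overall fit, while the maximum flags a single badly matched variable that an average could hide.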
- Book Chapter
- 10.4337/9781800376878.00047
- Jul 15, 2022
Researchers conducting business-to-business marketing research using observational data face issues in establishing a causal relationship between marketing decisions and outcomes. This chapter discusses methods that help marketing scholars in establishing causality in B2B marketing research. First, the authors present causes of endogeneity that hinder researchers in establishing causality. Second, the authors discuss well-established endogeneity correction methods and recent developments in correcting endogeneity. In doing so, they first discuss instrumental variable and instrument-free endogeneity correction methods. Then, they discuss quasi-experimental methods - matching methods, difference-in-difference analysis, synthetic control methods, and regression discontinuity design to correct endogeneity issues. Finally, the authors also discuss endogeneity concerns related to peer effects and the causal forest model to explain the heterogeneity in treatment effects. The chapter also provides STATA or R codes wherever possible to help marketing researchers implement the endogeneity correction models.