A More Credible Approach to Parallel Trends
Abstract: This paper proposes tools for robust inference in difference-in-differences and event-study designs where the parallel trends assumption may be violated. Instead of requiring that parallel trends holds exactly, we impose restrictions on how different the post-treatment violations of parallel trends can be from the pre-treatment differences in trends (“pre-trends”). The causal parameter of interest is partially identified under these restrictions. We introduce two approaches that guarantee uniformly valid inference under the imposed restrictions, and we derive novel results showing that they have desirable power properties in our context. We illustrate how economic knowledge can inform the restrictions on the possible violations of parallel trends in two economic applications. We also highlight how our approach can be used to conduct sensitivity analyses showing what causal conclusions can be drawn under various restrictions on the possible violations of the parallel trends assumption.
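One simple way to make such a restriction concrete is to cap the post-treatment violation at a multiple of the largest change in the pre-trend between consecutive pre-periods. The Python sketch below computes the resulting bounds on the effect for a single post-treatment period. The coefficient values, the function names, the parameter `m_bar`, and this particular form of the restriction are all assumptions made for illustration. The sketch also ignores sampling uncertainty, so it is not the inference procedure the paper develops.

```python
import numpy as np

def identified_set(pre_coefs, post_coef, m_bar):
    """Bounds on the effect in the first post-treatment period.

    pre_coefs: event-study coefficients for the pre-periods, in time order,
               ending with the reference period (normalised to 0).
    post_coef: event-study coefficient for the first post-treatment period.
    m_bar:     assumed cap on the post-treatment violation, expressed as a
               multiple of the largest consecutive change in pre_coefs.
    """
    max_pre_change = np.max(np.abs(np.diff(pre_coefs)))
    slack = m_bar * max_pre_change
    return post_coef - slack, post_coef + slack

def breakdown_value(pre_coefs, post_coef):
    """Smallest m_bar at which the bounds first include zero."""
    max_pre_change = np.max(np.abs(np.diff(pre_coefs)))
    return abs(post_coef) / max_pre_change

# Hypothetical estimates for periods t = -3, -2, -1 (reference) and t = 0.
pre = np.array([0.3, 0.1, 0.0])
post = 1.0

for m_bar in (0.0, 1.0, 2.0):
    lo, hi = identified_set(pre, post, m_bar)
    print(f"m_bar={m_bar:.1f}: effect in [{lo:.2f}, {hi:.2f}]")
print(f"breakdown m_bar = {breakdown_value(pre, post):.2f}")
```

Reporting the bounds over a grid of `m_bar` values, together with the breakdown value, is one way to present the kind of sensitivity analysis the abstract describes.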
- Research Article
- 10.1002/sim.70459
- Mar 1, 2026
- Statistics in medicine
Difference-in-differences (DID) is popular because it can allow for unmeasured confounding when the key assumption of parallel trends holds. However, there exists little guidance on how to decide a priori whether this assumption is reasonable. We attempt to develop such guidance by considering the relationship between a causal diagram and the parallel trends assumption. This is challenging because parallel trends is scale-dependent and causal diagrams are generally scale-independent. We develop conditions under which, given a nonparametric causal diagram, one can reject or fail to reject parallel trends. In particular, we adopt a linear faithfulness assumption, which states that all graphically connected variables are correlated, and which is often reasonable in practice. We show that parallel trends can be rejected if either (i) the treatment is affected by pre-treatment outcomes, or (ii) there exist unmeasured confounders for the effect of treatment on pre-treatment outcomes that are not confounders for the post-treatment outcome, or vice versa. We also argue that parallel trends should be strongly questioned if (iii) the pre-treatment outcomes causally affect the post-treatment outcomes, since there exist reasonable semiparametric models in which such an effect violates parallel trends. When (i-iii) are absent, a necessary and sufficient condition for parallel trends is that the association between unmeasured confounders and potential outcomes is constant on an additive scale, pre- and post-treatment. We discuss our approach in the context of the effect of Medicaid expansion under the US Affordable Care Act on health insurance coverage rates.
- Research Article
6
- 10.57017/jorit.v2.2(4).07
- Dec 1, 2023
- Journal of Research, Innovation and Technologies (JoRIT)
Traditional tests for parallel trends in difference-in-differences designs are based on observing the mean values of the dependent variable in the treatment and control groups over time. However, in light of recent discussions prompted by the development of event-study designs, controlling for observable factors may affect whether the parallel trends assumption is fulfilled. This article presents a simple test based on the statistical significance of pre-treatment periods, which can be extended from the classic difference-in-differences design to event-study designs with universal absorbing treatments. The test requires at least two pre-treatment periods and can be done by constructing appropriate dummy variables. © 2023 The Author(s). Published by RITHA Publishing. This article is distributed under the terms of the CC-BY 4.0 license, which permits any further distribution in any medium, provided the original work is properly cited.
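A dummy-variable pre-trend test of this general kind can be sketched in Python as follows. The abstract does not give the article's exact specification, so the simulated data, the function name `event_study_ols`, and the choice of the last pre-treatment period as the reference are assumptions. The standard errors are the classical OLS ones, which assume independent errors; with real panel data one would usually cluster by unit.

```python
import numpy as np

def event_study_ols(y, treated, period, ref_period):
    """OLS of y on a treated dummy, period dummies, and their interactions.

    The interaction for a pre-treatment period is a placebo "effect": under
    parallel trends it should be statistically indistinguishable from zero.
    Returns {name: (coefficient, t-statistic)}.
    """
    periods = [int(p) for p in np.unique(period) if p != ref_period]
    names = ["const", "treated"]
    cols = [np.ones(len(y)), treated.astype(float)]
    for p in periods:
        names.append(f"period_{p}")
        cols.append((period == p).astype(float))
    for p in periods:
        names.append(f"treated_x_period_{p}")
        cols.append(treated * (period == p))
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return {name: (b, b / s) for name, b, s in zip(names, beta, se)}

# Simulated data: periods 0 and 1 are pre-treatment, period 2 is post.
# Parallel trends hold by construction and the true effect is 2.0.
rng = np.random.default_rng(0)
n = 2000  # observations per group-period cell
treated = np.repeat(np.array([0, 1]), 3 * n)
period = np.tile(np.repeat(np.array([0, 1, 2]), n), 2)
y = (1.0 * treated + 0.5 * period + 2.0 * treated * (period == 2)
     + rng.normal(0.0, 1.0, size=6 * n))

results = event_study_ols(y, treated, period, ref_period=1)
for name in ("treated_x_period_0", "treated_x_period_2"):
    coef, t_stat = results[name]
    print(f"{name}: coef={coef:.3f}, t={t_stat:.2f}")
```

Here `treated_x_period_0` is the placebo coefficient for the earlier pre-treatment period and `treated_x_period_2` is the post-treatment effect, both measured relative to period 1. With a single pre-treatment period there would be no placebo interaction left to test, which is consistent with the requirement of at least two pre-treatment periods.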
- Research Article
162
- 10.1177/0962280218814570
- Nov 25, 2018
- Statistical Methods in Medical Research
Difference-in-differences (DID) analysis is used widely to estimate the causal effects of health policies and interventions. A critical assumption in DID is "parallel trends": that pre-intervention trends in outcomes are the same between treated and comparison groups. To date, little guidance has been available to researchers who wish to use DID when the parallel trends assumption is violated. Using a Monte Carlo simulation experiment, we tested the performance of several estimators (standard DID; DID with propensity score matching; single-group interrupted time-series analysis; and multi-group interrupted time-series analysis) when the parallel trends assumption is violated. Using nationwide data from US hospitals (n = 3737) for seven data periods (four pre-interventions and three post-interventions), we used alternative estimators to evaluate the effect of a placebo intervention on common outcomes in health policy (clinical process quality and 30-day risk-standardized mortality for acute myocardial infarction, heart failure, and pneumonia). Estimator performance was assessed using mean-squared error and estimator coverage. We found that mean-squared error values were considerably lower for the DID estimator with matching than for the standard DID or interrupted time-series analysis models. The DID estimator with matching also had superior performance for estimator coverage. Our findings were robust across all outcomes evaluated.
- Research Article
18
- 10.1177/26320843211061306
- Sep 1, 2021
- Research Methods in Medicine & Health Sciences
Background. Difference-in-Difference makes a critical assumption that the changes in the outcomes, over the post-treatment period, are similar between the treated and control groups—the parallel trends assumption. Evaluation of this assumption is often done either by graphical examination or by statistical tests in the pre-treatment period. They result in a binary conclusion about the validity of the assumption. Purpose. This paper proposes a sensitivity analysis that quantifies the departure from parallel trends necessary to meaningfully change the estimated treatment effect. Results. Sensitivity analyses have an advantage over traditional parallel trends tests: they use all available data and thereby work even if only one pre-period is available, and they quantify the strength of unobserved confounder(s) required to change the conclusions of a study. Conclusions. We apply the sensitivity analysis metrics developed by Cinelli and Hazlett (2020) and illustrate them on two studies.
- Research Article
16
- 10.1001/jamanetworkopen.2021.38983
- Dec 15, 2021
- JAMA Network Open
Importance: Access to postpartum care is restricted for low-income women who are recent or undocumented immigrants enrolled in Emergency Medicaid. Objective: To examine the association of a policy extending postpartum coverage to Emergency Medicaid recipients with attendance at postpartum visits and use of postpartum contraception. Design, Setting, and Participants: This cohort study linked Medicaid claims and birth certificate data from 2010 to 2019 to examine changes in postpartum care coverage on postpartum care and contraception use. A difference-in-difference design was used to compare the rollout of postpartum coverage in Oregon with a comparison state, South Carolina, which did not cover postpartum care. The study used 2 distinct assumptions to conduct the analyses: first, preintervention differences in postpartum visit attendance and contraceptive use would have remained constant if the policy expanding coverage had not been passed (parallel trends assumption), and second, differences in preintervention trends would have continued without the policy change (differential trend assumption). Data analysis was performed from September 2020 to October 2021. Exposures: Medicaid coverage of postpartum care. Main Outcomes and Measures: Attendance at postpartum visits and postpartum contraceptive use, defined as receipt of any contraceptive method within 60 days of delivery. Results: The study population consisted of 27 667 live births among 23 971 women (mean [SD] age, 29.4 [6.0] years) enrolled in Emergency Medicaid. The majority of all births were to multiparous women (21 289 women [76.9%]; standardized mean difference [SMD] = 0.08) and were delivered vaginally (20 042 births [72.4%]; SMD = 0.03) and at term (25 502 births [92.2%]; SMD = 0.01). Following Oregon's expansion of postpartum coverage to women in Emergency Medicaid, there was a large and significant increase in postpartum care visits and contraceptive use.
Assuming parallel trends, postpartum care attendance increased by 40.6 percentage points (95% CI, 34.1-47.1 percentage points; P < .001) following the policy change. Under the differential trends assumption, postpartum visits increased by 47.9 percentage points (95% CI, 41.3-54.6 percentage points; P < .001). Postpartum contraception use increased similarly. Under the parallel trends assumption, postpartum contraception within 60 days increased by 33.2 percentage points (95% CI, 31.1-35.4 percentage points; P < .001). Assuming differential trends, postpartum contraception increased by 28.2 percentage points (95% CI, 25.8-30.6 percentage points; P < .001). Conclusions and Relevance: These findings suggest that expanding Emergency Medicaid benefits to include postpartum care is associated with significant improvements in receipt of postpartum care and contraceptive use.
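The two counterfactual assumptions described in this abstract can be contrasted with a small Python sketch. The yearly gaps below are hypothetical, the function name `did_estimates` is illustrative, and the abstract does not say how the study implemented either model, so this is only one plausible reading: under parallel trends the treated-minus-comparison gap is assumed to stay at its pre-period mean, while under differential trends the linear pre-period trend in the gap is extrapolated forward.

```python
import numpy as np

def did_estimates(years, gap, first_post_year):
    """Return (parallel-trends estimate, differential-trends estimate).

    gap: treated-minus-comparison difference in the outcome for each year.
    """
    years = np.asarray(years, dtype=float)
    gap = np.asarray(gap, dtype=float)
    pre = years < first_post_year
    post = ~pre

    # Parallel trends: the counterfactual gap stays at its pre-period mean.
    parallel = gap[post].mean() - gap[pre].mean()

    # Differential trends: the counterfactual gap keeps following the
    # straight line fitted to the pre-period gaps.
    slope, intercept = np.polyfit(years[pre], gap[pre], deg=1)
    counterfactual = intercept + slope * years[post]
    differential = (gap[post] - counterfactual).mean()
    return parallel, differential

# Hypothetical gaps: widening by 0.2 per year before the policy (years 0-4),
# then jumping in the two post-policy years (5 and 6).
years = [0, 1, 2, 3, 4, 5, 6]
gap = [1.0, 1.2, 1.4, 1.6, 1.8, 3.0, 3.2]
parallel, differential = did_estimates(years, gap, first_post_year=5)
print(f"parallel trends:     {parallel:.2f}")
print(f"differential trends: {differential:.2f}")
```

With these made-up numbers the parallel-trends estimate is 1.70 and the differential-trends estimate is 1.00, because part of the post-policy gap is attributed to the continuing pre-policy trend. When the pre-period gap is narrowing instead, the ordering reverses.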
- Research Article
6
- 10.1111/biom.13862
- Mar 29, 2023
- Biometrics
Many research questions in public health and medicine concern sustained interventions in populations defined by substantive priorities. Existing methods to answer such questions typically require a measured covariate set sufficient to control confounding, which can be questionable in observational studies. Differences-in-differences rely instead on the parallel trends assumption, allowing for some types of time-invariant unmeasured confounding. However, most existing difference-in-differences implementations are limited to point treatments in restricted subpopulations. We derive identification results for population effects of sustained treatments under parallel trends assumptions. In particular, in settings where all individuals begin follow-up with exposure status consistent with the treatment plan of interest but may deviate at later times, a version of Robins' g-formula identifies the intervention-specific mean under the stable unit treatment value assumption, positivity, and parallel trends. We develop consistent asymptotically normal estimators based on inverse-probability weighting, outcome regression, and a double robust estimator based on targeted maximum likelihood. Simulation studies confirm theoretical results and support the use of the proposed estimators at realistic sample sizes. As an example, the methods are used to estimate the effect of a hypothetical federal stay-at-home order on all-cause mortality during the COVID-19 pandemic in spring 2020 in the United States.
- Research Article
172
- 10.1086/711509
- Sep 3, 2020
- Journal of the Association of Environmental and Resource Economists
Difference-in-differences (DID) research designs usually rely on variation of treatment timing such that, after making an appropriate parallel trends assumption, one can identify, estimate, and make inference about causal effects. In practice, however, different DID procedures rely on different parallel trends assumptions (PTAs), and recover different causal parameters. In this paper, we focus on staggered DID (also referred to as event studies) and discuss the role played by the PTA in terms of identification and estimation of causal parameters. We document a “robustness” versus “efficiency” trade-off in terms of the strength of the underlying PTA and argue that practitioners should be explicit about these trade-offs whenever using DID procedures. We propose new DID estimators that reflect these trade-offs and derive their large sample properties. We illustrate the practical relevance of these results by assessing whether the transition from federal to state management of the Clean Water Act affects compliance rates.
- Research Article
77
- 10.1017/pan.2019.25
- Jul 11, 2019
- Political Analysis
Difference-in-differences is a widely used evaluation strategy that draws causal inference from observational panel data. Its causal identification relies on the assumption of parallel trends, which is scale-dependent and may be questionable in some applications. A common alternative is a regression model that adjusts for the lagged dependent variable, which rests on the assumption of ignorability conditional on past outcomes. In the context of linear models, Angrist and Pischke (2009) show that the difference-in-differences and lagged-dependent-variable regression estimates have a bracketing relationship. Namely, for a true positive effect, if ignorability is correct, then mistakenly assuming parallel trends will overestimate the effect; in contrast, if the parallel trends assumption is correct, then mistakenly assuming ignorability will underestimate the effect. We show that the same bracketing relationship holds in general nonparametric (model-free) settings. We also extend the result to semiparametric estimation based on inverse probability weighting. We provide three examples to illustrate the theoretical results with replication files in Ding and Li (2019).
- Research Article
1
- 10.17657/jcr.2024.10.31.6
- Oct 1, 2024
- Journal of Channel and Retailing
Purpose: The purpose of this study is to compare and analyze the strengths and weaknesses of three widely used quasi-experimental methods in causal inference research: Difference-in-Differences (DID), Synthetic Control Method (SCM), and Synthetic Difference-in-Differences (SDID). The study specifically examines the impact of Early-Morning delivery services on existing offline commercial districts as a case study to empirically evaluate the accuracy and applicability of each method. Research design, data, and methodology: To compare the three methodologies (DID, SCM, and SDID), this study analyzed the impact of Early-Morning delivery services on offline supermarket sales. The treatment group consisted of the Seo-gu and Yuseong-gu districts in Daejeon, while the control group included selected districts in Gwangju and Busan. Using daily credit card panel data, the analysis covered sales changes from May 2, 2019, two years before the entry of dawn delivery services, to October 31, 2021, six months after their introduction. The DID method compared post-treatment sales changes between the treatment and control groups. The SCM method created a synthetic control group by assigning weights to the control group to closely match the treatment group during the pre-treatment period, then compared post-treatment sales changes between the treatment and synthetic control groups. Similarly, the SDID method generated a synthetic control group with the weights of SCM and the time weights to minimize differences between the pre- and post-treatment periods, and then compared post-treatment sales changes between the treatment and synthetic control groups. This approach allowed for the analysis of treatment effects based on differences observed after the intervention among the treatment, control, and synthetic control groups.
For analyses based on the number of control groups, selections were made in both similar and dissimilar orders by comparing the pre-treatment trends of Seo-gu and Yuseong-gu in Daejeon with those of potential control groups using Euclidean distance. The comparative analysis based on the pre-entry period fixed the post-treatment period to 6 months after the entry of the dawn delivery service, while the pre-treatment period was divided into 2 years, 1 year, 6 months, and 3 months before the entry. This setup allowed the study to examine cases where the pre-treatment period was longer than, equal to, or shorter than the post-treatment period. Results: The analysis results indicated that the introduction of Early-Morning delivery services had a negative impact on offline supermarket sales across all three models: DID, SCM, and SDID. However, the SCM and SDID models, by constructing synthetic control groups, maximized the similarity between the treatment and control groups, resulting in more reliable counterfactual estimates and providing more stable estimates than the DID model. This is because DID may produce biased results if the pre-trends of the treatment and control groups are not similar. Specifically, the SCM model delivered stable results even when the number of control groups was small and when the parallel trend assumption was not met. The SDID model produced more stable results as the pre-intervention period extended. Conclusions: In conclusion, the DID model is appropriate when the parallel trend assumption is satisfied; however, when this assumption is not met, the SCM or SDID models are more suitable. SCM and SDID are particularly effective in scenarios where the parallel trend is not satisfied or the number of control groups is small. Additionally, the SDID model has proven to be useful for analyzing long-term data. 
The study highlights that the performance of each method can vary depending on the situation, emphasizing the importance of selecting the appropriate methodology based on the specific circumstances of the analysis. Consequently, these models contribute to more accurate causal inference in policy evaluation and can aid in decision-making and strategy development across various fields.
- Research Article
15
- 10.1017/s0003055425000243
- Jun 9, 2025
- American Political Science Review
Two-way fixed effects (TWFE) models are widely used in political science to establish causality, but recent methodological discussions highlight their limitations under heterogeneous treatment effects (HTE) and violations of the parallel trends (PT) assumption. This growing literature has introduced numerous new estimators and procedures, causing confusion among researchers about the reliability of existing results and best practices. To address these concerns, we replicated and reanalyzed 49 studies from leading journals that employ TWFE models for causal inference using observational panel data with binary treatments. Using six HTE-robust estimators, diagnostic tests, and sensitivity analyses, we find: (i) HTE-robust estimators yield qualitatively similar but highly variable results; (ii) while a few studies show clear signs of PT violations, many lack evidence to support this assumption; and (iii) many studies are underpowered when accounting for HTE and potential PT violations. We emphasize the importance of strong research designs and rigorous validation of key identifying assumptions.
- Research Article
66
- 10.1001/jamainternmed.2017.7455
- Jan 16, 2018
- JAMA Internal Medicine
Importance: In 2014, the State of Maryland placed the majority of its hospitals under all-payer global budgets for inpatient, hospital outpatient, and emergency department care. Goals of the program included reducing unnecessary hospital utilization and encouraging greater use of primary care. Objective: To compare changes in hospital and primary care use through the first 2 years of Maryland's hospital global budget program among fee-for-service Medicare beneficiaries in Maryland vs matched control areas. Design, Setting, and Participants: We matched 8 Maryland counties (94 967 beneficiaries) with hospitals in the program to 27 non-Maryland control counties (206 389 beneficiaries). Using difference-in-differences analysis, we compared changes in hospital and primary care use in Maryland vs the control counties from before (2009-2013) to after (2014-2015) the payment change, using 2 different assumptions. First, we assumed that preintervention differences between Maryland and the control counties would have remained constant past 2014 had Maryland not implemented global budgets (parallel trend assumption). Second, we assumed that differences in preintervention trends would have continued without the payment change (differential trend assumption). Main Outcomes and Measures: Hospital stays (defined as admissions and observation stays); return hospital stays within 30 days of a prior hospital stay; emergency department visits that did not result in admission; price-standardized hospital outpatient department (HOPD) utilization; and visits with primary care physicians (overall and within 7 days of a hospital stay). Results: We matched 8 Maryland counties with hospitals in the program (94 967 beneficiaries; 41.8% male; mean [SD] age, 72.3 [12.2] years) to 27 non-Maryland control counties (206 389 beneficiaries; 42.8% male; mean [SD] age, 71.7 [12.5] years).
Assuming parallel trends, we estimated a differential change in Maryland of -0.47 annual hospital stays per 100 beneficiaries (95% CI, -1.65 to 0.72; P = .43) from the preintervention period (2009-2013) to 2015, but assuming differential trends, we estimated a differential change in Maryland of -1.24 stays per 100 beneficiaries (95% CI, -2.46 to -0.02; P = .047). Assuming parallel trends, we found a significant increase in primary care visits (+10.6 annual visits/100 beneficiaries; 95% CI, 4.6 to 16.6 annual visits/100 beneficiaries; P = .001), but assuming differential trends, we found no change (-0.8 visits/100 beneficiaries; 95% CI, -10.6 to 9.0 visits/100 beneficiaries; P = .87). Comparing estimates with both trend assumptions, we found no consistent changes in emergency department visits, return hospital stays, HOPD use, or posthospitalization primary care visits associated with Maryland's program. Conclusions and Relevance: We did not find consistent evidence that Maryland's hospital global budget program was associated with reductions in hospital use or increases in primary care visits among fee-for-service Medicare beneficiaries after 2 years. Evaluations over longer periods should be pursued.
- Discussion
6
- 10.1016/j.ajog.2022.05.041
- May 22, 2022
- American journal of obstetrics and gynecology
Buprenorphine uptake during pregnancy following the 2017 guidelines update on prenatal opioid use disorder
- Research Article
3
- 10.1136/bmjopen-2024-083927
- May 1, 2024
- BMJ Open
Objectives: To assess the reporting and methodological quality of early-life policy intervention papers that applied difference-in-differences (DiD) analysis. Study design: Systematic review. Data sources: Papers applying DiD of early-life policy interventions in high-income countries as...
- Research Article
- 10.1162/rest_a_01553
- Jan 28, 2025
- Review of Economics and Statistics
A key assumption of difference-in-differences designs is that the average evolution of untreated potential outcomes is the same across different treatment cohorts, known as the parallel trend assumption. In this paper, we relax the parallel trend assumption by assuming a latent type variable and developing a type-specific parallel trend. With a finite support assumption on the latent type and long pretreatment time periods, an extremum classifier consistently estimates the type assignment. Based on the classification, we propose a type-specific DiD estimator for type-specific ATT. By estimating the type-specific ATT, we study heterogeneity in treatment effect, in addition to heterogeneity in baseline outcomes.
- Research Article
1
- 10.1200/jco.2020.38.15_suppl.7035
- May 20, 2020
- Journal of Clinical Oncology
7035 Background: Medicaid expansion has been associated with increased access to care and earlier stage at diagnosis among patients with head and neck cancer (HNC). However, it is unclear whether Medicaid expansion has impacted HNC mortality rates. We examined the associations between early Medicaid expansions (2010-2011) with mortality rates for HNC in the United States. Methods: Data were obtained from the Surveillance, Epidemiology, and End Results (SEER) program. SEER*Stat was utilized to obtain mortality rates for early expansion (CA, CT, DC, MN, NJ, and WA) and non-early expansion states (all others) in the year ranges as available in SEER: 2005-2007 (pre-expansion) and 2012-2016 (post-expansion). Deaths in 2008-2011 were excluded as a phase-in/washout period. Difference-in-differences analyses were utilized to compare mortality rates pre- and post-early expansion in early expansion vs. non-early expansion states. The parallel trends assumption was tested comparing changes in HNC mortality rates between early expansion and non-early expansion states from 2002-2004 to 2005-2007 and from 2005-2007 to 2008-2011. Results: There were 6882 and 35459 deaths due to HNC in early expansion and non-early expansion states, respectively. HNC mortality rates (deaths per 100,000) decreased from 2005-2007 to 2012-2016 in both early expansion (2.17 to 1.85, difference = -0.32, 95% CI = -0.42 to -0.22) and non-expansion states (2.59 to 2.43, difference = -0.16, 95% CI = -0.22 to -0.11). Relative to non-expansion states, there was a reduction of 0.16 deaths per 100,000 (95% CI = 0.05 to 0.27, p = 0.007) after early Medicaid expansion in expansion states. However, in parallel trends testing, there was no difference in the change in mortality rates between early expansion and non-expansion states from 2002-2011 (p > 0.37). Conclusions: In this quasi-experimental analysis, there was an association between early Medicaid expansion with decreased HNC mortality. 
Thus, Medicaid expansion might help decrease disparities associated with access to care among HNC survivors. As longer-term data emerges, additional follow-up will be necessary to understand the mechanisms that underlie the HNC mortality benefits seen in early Medicaid expansion.
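The difference-in-differences estimate reported in this abstract can be reproduced from the four mortality rates given in the Results. The notation below is introduced here for illustration only, with E denoting early expansion states and N denoting non-early expansion states:

$$
\widehat{\mathrm{DiD}}
= \left(\bar{Y}_{E,\text{post}} - \bar{Y}_{E,\text{pre}}\right)
- \left(\bar{Y}_{N,\text{post}} - \bar{Y}_{N,\text{pre}}\right)
= (1.85 - 2.17) - (2.43 - 2.59)
= -0.32 - (-0.16)
= -0.16
$$

That is, a relative reduction of 0.16 deaths per 100,000, matching the reported figure.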