Methods for matching in longitudinal cohort studies, such as sequential stratification and time-varying propensity scores, facilitate causal inference for non-randomized, time-dependent treatments where patient eligibility or treatment status changes over time. The tradeoffs among available approaches have not previously been compared, so we compare these two methods using simulations based on a retrospective cohort of patients eligible for weight loss surgery, some of whom received it. We compare matching completeness, bias, coverage, and precision among three approaches to longitudinal matching: (1) time-varying propensity scores (tvPS), (2) sequential stratification that matches exactly on all covariates used in tvPS (SS-Full), and (3) sequential stratification that matches exactly on a subset of covariates (SS-Selected). These comparisons are made with both a deep sampling frame (50:1) and a shallow sampling frame (5:1) of eligible comparators. In 1,000 simulations each, tvPS retained more than 99.9% of treated patients in both the deep and shallow sampling frames. In the deep sampling frame, sequential stratification retained a smaller proportion of treated patients (91.6% for SS-Full, 98.2% for SS-Selected); in the shallow sampling frame, it retained far fewer (73.9% for SS-Full, 92.0% for SS-Selected). Coverage, precision, and bias were nonetheless comparable for tvPS, SS-Full, and SS-Selected in both sampling frames. Because tvPS performs comparably to sequential stratification in coverage, bias, and precision while offering superior match completeness, it is an attractive option for longitudinal matching studies where external validity is highly valued.