Abstract

Background

Multivariable confounder adjustment in comparative studies of newly marketed drugs can be limited by small numbers of exposed patients and even fewer outcomes. Disease risk scores (DRSs) developed in historical comparator drug users before the new drug entered the market may improve adjustment. However, in a high dimensional data setting, empirical selection of hundreds of potential confounders and modeling of the DRS, even in the historical cohort, can lead to over-fitting and reduced predictive performance in the study cohort. We propose the use of combinations of dimension reduction and shrinkage methods to overcome this problem, and compare the performance of these modeling strategies for implementing high dimensional (hd) DRSs from historical data in two empirical study examples of newly marketed drugs versus comparator drugs after the new drugs’ market entry: dabigatran versus warfarin for the outcome of major hemorrhagic events, and cyclooxygenase-2 inhibitors (coxibs) versus nonselective non-steroidal anti-inflammatory drugs (nsNSAIDs) for gastrointestinal bleeds.

Results

In the dabigatran example, historical hdDRSs that included predefined and empirical outcome predictors with dimension reduction (principal component analysis; PCA) and shrinkage (lasso and ridge regression) approaches had higher c-statistics in the warfarin users (0.66 for the PCA model, 0.64 for the PCA + ridge model, and 0.65 for the PCA + lasso model) than an unreduced model (c-statistic, 0.54). The odds ratio (OR) from PCA + lasso hdDRS-stratification [OR, 0.64; 95% confidence interval (CI) 0.46–0.90] was closer to the benchmark estimate (0.93) from a randomized trial than that from the model without empirical predictors (OR, 0.58; 95% CI 0.41–0.81). In the coxibs example, c-statistics of the hdDRSs in the nsNSAID initiators were 0.66 for the PCA model, 0.67 for the PCA + ridge model, and 0.67 for the PCA + lasso model; these were higher than for the unreduced model (c-statistic, 0.45) and comparable to the demographics + risk score model (c-statistic, 0.67).

Conclusions

Estimating hdDRSs from historical data with dimension reduction and shrinkage was feasible and improved confounding adjustment in two studies of newly marketed medications.

Electronic supplementary material

The online version of this article (doi:10.1186/s12982-016-0047-x) contains supplementary material, which is available to authorized users.

Highlights

  • Multivariable confounder adjustment in comparative studies of newly marketed drugs can be limited by small numbers of exposed patients and even fewer outcomes

  • Although the number of individuals exposed to the new drug and the number of outcomes may be limited in the early marketing phase, there often exist many individuals exposed to the comparator product in the period preceding market entry of the new drug

  • Because we evaluated the performance of the high dimensional disease risk score (hdDRS) models in two examples using two large administrative databases from the US, the results may not generalize to other study settings where the number of outcomes in the historical cohort is different, the number of potential confounders may be different, or where coding practices or clinical practice evolve in different patterns


Summary

Introduction

Multivariable confounder adjustment in comparative studies of newly marketed drugs can be limited by small numbers of exposed patients and even fewer outcomes. While DRSs offer dimension reduction benefits similar to those of propensity scores (PSs) and have an important balancing property distinct from that of the PS for alternatively treated patients [11], empirical selection and inclusion of hundreds of potential confounders in the DRS estimation model can lead to over-fitting in the historical cohort and reduced predictive performance in the study cohort. To stably estimate historical high-dimensional DRSs (hdDRSs) with large numbers of variables, we propose the use of dimension reduction via principal component analysis and shrinkage with ridge and lasso regression. These techniques have often been used for prediction modeling in genetic epidemiology [12,13,14], but less frequently in clinical and pharmaco-epidemiology.
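
To make the shrinkage idea concrete, the following is a minimal sketch and not the authors' implementation. It fits a ridge-penalized logistic outcome model in a simulated "historical" cohort, applies the fitted model to a simulated "study" cohort to obtain DRSs, and evaluates discrimination with the c-statistic. Everything here is an assumption made for illustration: the simulated data, the function names (`fit_ridge_logistic`, `risk_scores`, `c_statistic`, `simulate_cohort`), the penalty value, and the use of plain gradient descent. The PCA step and the lasso variant are omitted for brevity.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_ridge_logistic(X, y, lam, lr=0.1, n_iter=300):
    """Gradient descent on mean log-loss + (lam / 2) * ||w||^2.

    The intercept b is not penalized. lam, lr and n_iter are illustrative.
    """
    n, p = len(X), len(X[0])
    b, w = 0.0, [0.0] * p
    for _ in range(n_iter):
        grad_b, grad_w = 0.0, [0.0] * p
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(p):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        w = [wj - lr * (gj / n + lam * wj) for wj, gj in zip(w, grad_w)]
    return b, w


def risk_scores(X, b, w):
    # Predicted outcome probability for each patient: the disease risk score.
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]


def c_statistic(scores, outcomes):
    """Share of (event, non-event) pairs in which the event has the higher
    score; tied pairs count as one half."""
    events = [s for s, o in zip(scores, outcomes) if o == 1]
    non_events = [s for s, o in zip(scores, outcomes) if o == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e in events:
        for ne in non_events:
            if e > ne:
                concordant += 1.0
            elif e == ne:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


def simulate_cohort(rng, n, p=5):
    # Hypothetical data: only the first two covariates truly predict the outcome.
    X, y = [], []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(p)]
        logit = -2.0 + 0.8 * x[0] + 0.5 * x[1]
        X.append(x)
        y.append(1 if rng.random() < sigmoid(logit) else 0)
    return X, y


rng = random.Random(0)
X_hist, y_hist = simulate_cohort(rng, 400)    # historical comparator users
X_study, y_study = simulate_cohort(rng, 200)  # cohort after market entry

# Fit in the historical cohort only, then score the study cohort.
b, w = fit_ridge_logistic(X_hist, y_hist, lam=1.0)
drs_study = risk_scores(X_study, b, w)
print("c-statistic in study cohort:", round(c_statistic(drs_study, y_study), 3))
```

The key design point in this sketch is that the study cohort contributes nothing to model fitting, so the scarcity of outcomes among new-drug users does not affect the estimated coefficients. In a real analysis, the covariates would typically be standardized (or replaced by principal component scores) before penalization, and the penalty would be chosen by cross-validation rather than fixed.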

