Abstract

Data missingness is a major challenge in clinical trials and observational real-world healthcare research.1 Studies often exclude records that contain incomplete data, but this strategy can introduce bias. Therefore, we evaluated the utility of different techniques for imputing missing data. Because biases tend to differ between randomized controlled trial and observational datasets,2 we conducted this assessment in both clinical trial and real-world datasets for acute myeloid leukemia (AML). Evaluations of this sort are complicated by data presented in variable formats.1 Thus, we used a standard data model to facilitate meaningful comparison across data types. Clinical trial data were derived from a pooled dataset of 7 clinical trials (n=719) for relapsed/refractory AML, drawn from a 2012-2017 Medidata archive of trials and created using CDISC SDTM. Real-world data were obtained from a US-based, geographically representative, oncology-focused electronic medical record dataset. We converted all datasets to the OMOP Common Data Model (v5) and used the SHYFT Strata platform to standardize the application of imputation methods. We then artificially introduced missingness into the datasets. We applied widely used3 imputation methods, including Predictive Mean Matching, K Nearest Neighbor, Iterative Imputer, and MICE, to repeatedly generate values for the missing data. We evaluated the quality of the imputed values using standard approaches, including RMSE, unsupervised classification error (UCE), and AUC scores. We observed variation in imputation method performance both within and between data sources, consistent with previous reports. The use of a standardized data model enabled us to provide a robust evaluation of strategies and to make reliable comparisons between the results on a faster timeline. Imputation techniques can significantly improve the informativeness of health economics and outcomes research (HEOR) when appropriate methods are tested and applied. Clinical data standards such as the OMOP CDM are well suited to enable rigorous and repeatable methodological evaluations.
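
The evaluation loop described above (mask known values, impute them, score the imputed values against the held-back truth with RMSE) can be illustrated with a minimal sketch. This is not the study's pipeline, which ran on OMOP-formatted data through the SHYFT Strata platform. It assumes synthetic numeric data, a missing-completely-at-random masking scheme (the abstract does not state the mechanism used), a hand-rolled K Nearest Neighbor imputer, and a column-mean baseline that is not among the methods named in the abstract. All function names and parameters are hypothetical.

```python
import math
import random


def make_data(n_rows=200, seed=0):
    """Synthetic table with three correlated numeric columns (not clinical data)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_rows):
        x = rng.gauss(0.0, 1.0)
        rows.append([x, 2.0 * x + rng.gauss(0.0, 0.3), -x + rng.gauss(0.0, 0.3)])
    return rows


def mask_mcar(rows, frac=0.2, seed=1):
    """Copy the table, blank roughly `frac` of cells at random, and record where."""
    rng = random.Random(seed)
    masked = [row[:] for row in rows]
    holes = []
    for i, row in enumerate(masked):
        for j in range(len(row)):
            if rng.random() < frac:
                row[j] = None
                holes.append((i, j))
    return masked, holes


def impute_mean(masked):
    """Baseline: fill each missing cell with the mean of its column's observed values."""
    means = []
    for j in range(len(masked[0])):
        observed = [r[j] for r in masked if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in masked]


def impute_knn(masked, k=5):
    """Fill each missing cell with the average of that column over the k nearest rows.

    Distance is the root mean squared difference over columns observed in both rows.
    Cells with no usable neighbor keep the column-mean fallback.
    """
    filled = impute_mean(masked)
    for i, row in enumerate(masked):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for other in masked:
                if other is row or other[j] is None:
                    continue
                shared = [(a - b) ** 2 for a, b in zip(row, other)
                          if a is not None and b is not None]
                if shared:
                    candidates.append((math.sqrt(sum(shared) / len(shared)), other[j]))
            nearest = sorted(candidates, key=lambda c: c[0])[:k]
            if nearest:
                filled[i][j] = sum(v for _, v in nearest) / len(nearest)
    return filled


def rmse(truth, filled, holes):
    """Root mean squared error over the artificially masked cells only."""
    return math.sqrt(sum((truth[i][j] - filled[i][j]) ** 2 for i, j in holes) / len(holes))


truth = make_data()
masked, holes = mask_mcar(truth)
for name, method in [("mean", impute_mean), ("knn", impute_knn)]:
    print(f"{name:>4} RMSE on masked cells: {rmse(truth, method(masked), holes):.3f}")
```

Scoring only the masked cells matters here: the observed cells are copied through unchanged, so including them would make every method look better than it is. Repeating the loop with different masking seeds gives a distribution of scores per method rather than a single number.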

