Abstract
Systematically missing data in distributed data networks present practical and methodological challenges, and failure to handle them appropriately can bias statistical inference. Multiple imputation can be used to address systematic missingness. However, when data from different study sites cannot be pooled into a single file, conventional imputation approaches become unavailable because the sites with systematically missing data lack a basis for imputation. To address this challenge, we introduce an imputation method based on conditional quantiles, conditional quantile imputation (CQI), which involves four steps: (i) estimating 99 conditional quantiles of the systematically missing variable in studies where it is observed; (ii) deriving a weighted average of the regression coefficients across those studies and transmitting it to the sites with systematically missing data; (iii) imputing the systematically missing values from the locally observed data and the set of regression coefficients from step (ii); and (iv) combining estimates of the substantive outcome model across imputations using Rubin's rules. We evaluate CQI in different simulation scenarios and illustrate it with an applied data example. We conclude that CQI can be a suitable approach for imputing systematically missing data when data from multiple studies cannot be pooled.
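Steps (ii) to (iv) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the use of site-level weights such as sample sizes, and the rule of drawing one quantile level uniformly at random for each imputed value are all hypothetical choices that the abstract does not specify. Step (i), fitting the quantile regressions at each site, is assumed to have been done already. Only the pooling formulas in `rubin_pool` follow the standard statement of Rubin's rules.

```python
import random
from statistics import mean, variance


def pool_coefficients(site_coefs, site_weights):
    """Step (ii): weighted average of per-site coefficient vectors.

    site_coefs[s][q] holds the coefficient list for quantile level q at site s.
    The weighting scheme (e.g. site sample sizes) is an assumption.
    """
    total = sum(site_weights)
    n_levels = len(site_coefs[0])
    n_params = len(site_coefs[0][0])
    return [
        [
            sum(w * coefs[q][p] for coefs, w in zip(site_coefs, site_weights)) / total
            for p in range(n_params)
        ]
        for q in range(n_levels)
    ]


def impute_value(x_row, pooled, rng):
    """Step (iii): impute one value at a site where the variable is missing.

    Assumed rule: pick one quantile level at random and use its pooled
    coefficients as a linear predictor. x_row starts with 1.0 (intercept).
    """
    beta = rng.choice(pooled)
    return sum(b * x for b, x in zip(beta, x_row))


def rubin_pool(estimates, variances):
    """Step (iv): combine m point estimates and their variances (Rubin's rules)."""
    m = len(estimates)
    q_bar = mean(estimates)        # pooled point estimate
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (divisor m - 1)
    total_var = within + (1 + 1 / m) * between
    return q_bar, total_var


# Toy usage: two sites, one quantile level, intercept plus one covariate.
pooled = pool_coefficients([[[0.0, 1.0]], [[2.0, 3.0]]], site_weights=[1, 3])
imputed = impute_value([1.0, 2.0], pooled, random.Random(0))
estimate, total_variance = rubin_pool([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

In a real distributed setting only the pooled coefficients would cross site boundaries, which is what makes the approach usable when individual-level data cannot be shared.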