Abstract
Objectives. To describe and compare 3 garbage code (GC) redistribution models: naïve Bayes classifier (NB), coarsened exact matching (CEM), and multinomial logistic regression (MLR).Methods. We analyzed Taiwan Vital Registration data (2008-2016) using a 2-step approach. First, we used non-GC death records to evaluate 3 different prediction models (NB, CEM, and MLR), incorporating individual-level information on multiple causes of death (MCDs) and demographic characteristics. Second, we applied the best-performing model to GC death records to predict the underlying causes of death. We conducted additional simulation analyses for evaluating the predictive performance of models.Results. When we did not account for MCDs, all 3 models presented high average misclassification rates in GC assignment (NB, 81%; CEM, 86%; MLR, 81%). In the presence of MCD information, NB and MLR exhibited significant improvement in assignment accuracy (19% and 17% misclassification rate, respectively). Furthermore, CEM without a variable selection procedure resulted in a substantially higher misclassification rate (40%).Conclusions. Comparing potential GC redistribution approaches provides guidance for obtaining better estimates of cause-of-death distribution and highlights the significance of MCD information for vital registration system reform.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.