Abstract

Background
Machine learning (ML) algorithms have been successfully employed for prediction of outcomes in clinical research. In this study, we explored the application of ML-based algorithms to predict cause of death (CoD) from verbal autopsy records available through the Million Death Study (MDS).

Methods
From the MDS, 18,826 unique childhood deaths at ages 1–59 months during 2004–13 were selected for generating the prediction models; over 70% of these deaths were caused by six infectious diseases (pneumonia, diarrhoeal diseases, malaria, fever of unknown origin, meningitis/encephalitis, and measles). Six popular ML algorithms (support vector machine (SVM), gradient boosting modeling, C5.0, artificial neural network, k-nearest neighbor, and classification and regression tree) were used to build the CoD prediction models.

Results
The SVM algorithm was the best performer, with a prediction accuracy of over 0.8. The highest accuracy was found for diarrhoeal diseases (accuracy = 0.97) and the lowest for meningitis/encephalitis (accuracy = 0.80). The top signs/symptoms for classifying these CoDs were also extracted for each disease. A combination of signs/symptoms presented by the deceased individual can effectively lead to the CoD diagnosis.

Conclusions
Overall, this study affirms that verbal autopsy tools are efficient in CoD diagnosis and that automated classification parameters captured through ML could be added to verbal autopsies to improve classification of causes of death.
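The abstract reports an accuracy for each disease alongside an overall accuracy. It does not say how the per-disease figures were computed. One common convention, assumed here for illustration only, is one-vs-rest accuracy: each record counts as correct when the model and the reference agree on whether it belongs to the cause in question. The function names and toy labels below are hypothetical and are not MDS data.

```python
def overall_accuracy(true_causes, predicted_causes):
    """Fraction of records whose predicted cause equals the reference cause."""
    if len(true_causes) != len(predicted_causes) or not true_causes:
        raise ValueError("label lists must be non-empty and of equal length")
    hits = sum(t == p for t, p in zip(true_causes, predicted_causes))
    return hits / len(true_causes)


def one_vs_rest_accuracy(true_causes, predicted_causes, cause):
    """Fraction of records correctly labelled as 'cause' or 'not cause'.

    Assumption: this is one plausible reading of a per-disease accuracy;
    the source does not state which definition the authors used.
    """
    if len(true_causes) != len(predicted_causes) or not true_causes:
        raise ValueError("label lists must be non-empty and of equal length")
    hits = sum((t == cause) == (p == cause)
               for t, p in zip(true_causes, predicted_causes))
    return hits / len(true_causes)


# Toy labels invented for this example only.
true_causes = ["pneumonia", "diarrhoea", "malaria", "pneumonia", "measles"]
predicted_causes = ["pneumonia", "diarrhoea", "pneumonia", "malaria", "measles"]

print(overall_accuracy(true_causes, predicted_causes))
print(one_vs_rest_accuracy(true_causes, predicted_causes, "diarrhoea"))
```

Under this convention, records that are correctly recognised as not belonging to a cause also count as hits, so a per-disease figure can be higher than the overall multi-class accuracy.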

Highlights

  • Machine learning (ML) algorithms have been successfully employed for prediction of outcomes in clinical research

  • Two independent physicians reviewed all the completed RHIME forms and assigned the underlying cause of death (CoD) according to the International Classification of Diseases, tenth revision (ICD-10) [15], and included a number of “keywords” in the record, which are signs and symptoms observed in the verbal autopsy (VA) that support their diagnosis

  • Pneumonia and diarrhoeal diseases are known to be major causes of childhood mortality in India, especially in poorer communities [39], and this is reflected in the Million Death Study (MDS) data


Summary

Introduction

Machine learning (ML) algorithms have been successfully employed for prediction of outcomes in clinical research. We have explored the application of ML-based algorithms to predict cause of death (CoD) from verbal autopsy records available through the Million Death Study (MDS). A second reason for poor documentation of deaths is that, unlike births, family members are not sufficiently incentivised to register a death. This gap in death records and associated data is a serious impediment to assessing the disease patterns and public health needs of a country. To address this gap, the Million Death Study (MDS) was initiated in India to quantify premature mortality through verbal autopsy (VA) [1, 2] in a nationally representative sample of homes. In cases where the CoD assignments of the two physicians do not match, the record is adjudicated by a third, senior physician.
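The two-physician coding rule described above can be sketched as a small function. This is a minimal illustration under stated assumptions: the name `assign_cause`, the use of `None` for a record still awaiting adjudication, and the example ICD-10 codes are all hypothetical and are not taken from the MDS workflow.

```python
from typing import Optional


def assign_cause(physician_a: str,
                 physician_b: str,
                 senior: Optional[str] = None) -> Optional[str]:
    """Return the final cause-of-death code for one verbal autopsy record.

    If the two independent physicians agree, their shared code is final.
    Otherwise the senior physician's code is used; None means the record
    is still waiting for adjudication (an assumption of this sketch).
    """
    if physician_a == physician_b:
        return physician_a
    return senior


# Illustrative ICD-10 codes only.
print(assign_cause("J18", "J18"))          # both physicians agree
print(assign_cause("J18", "A09", "A09"))   # disagreement, senior decides
print(assign_cause("J18", "A09"))          # disagreement, not yet adjudicated
```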
