Abstract

Background: A verbal autopsy (VA) is a post-hoc written interview report of the symptoms preceding a person’s death in cases where no official cause of death (CoD) was determined by a physician. Current leading automated VA coding methods primarily use structured data from VAs to assign a CoD category. We present a method to automatically determine CoD categories from VA free-text narratives alone.

Methods: After preprocessing and spelling correction, our method extracts word frequency counts from the narratives and uses them as input to four different machine learning classifiers: naïve Bayes, random forest, support vector machines, and a neural network.

Results: For individual CoD classification, our best classifier achieves a sensitivity of .770 for adult deaths for 15 CoD categories (as compared to the current best reported sensitivity of .57), and .662 with 48 WHO categories. When predicting the CoD distribution at the population level, our best classifier achieves .962 cause-specific mortality fraction accuracy for 15 categories and .908 for 48 categories, which is on par with leading CoD distribution estimation methods.

Conclusions: Our narrative-based machine learning classifier performs as well as classifiers based on structured data at the individual level. Moreover, our method demonstrates that VA narratives provide important information that can be used by a machine learning system for automated CoD classification. Unlike the structured questionnaire-based methods, this method can be applied to any verbal autopsy dataset, regardless of the collection process or country of origin.
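To make the pipeline described in the abstract concrete, the following is a minimal sketch of one of its four classifier families: word frequency counts fed to a naïve Bayes classifier. It is not the authors' implementation. The tokenizer, the add-one smoothing, the class name `NaiveBayesText`, the category labels, and the toy narratives are all assumptions made for illustration, and the spelling-correction step is omitted.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a narrative and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesText:
    """Multinomial naive Bayes over word counts, with add-one smoothing.

    Hypothetical helper for illustration; not taken from the paper.
    """

    def fit(self, narratives, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(narratives, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, narrative):
        counts = Counter(tokenize(narrative))
        n_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            total = sum(self.word_counts[label].values())
            score = math.log(n / n_docs)  # log prior of the category
            for word, freq in counts.items():
                if word not in self.vocab:
                    continue  # ignore words never seen in training
                p = (self.word_counts[label][word] + 1) / (total + len(self.vocab))
                score += freq * math.log(p)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy narratives and labels, not taken from any real VA dataset.
train_texts = [
    "he had chest pain and sweating then collapsed",
    "sudden chest pain and breathlessness",
    "high fever and chills for many days",
    "fever with chills and vomiting",
]
train_labels = ["cardiac", "cardiac", "infection", "infection"]

model = NaiveBayesText().fit(train_texts, train_labels)
prediction = model.predict("severe chest pain before he collapsed")
print(prediction)
```

Because the only input is the narrative text, a sketch like this needs no questionnaire fields, which is the property the abstract points to when it says the method can be applied regardless of the collection process.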

Highlights

  • A verbal autopsy (VA) is a post-hoc written interview report of the symptoms preceding a person’s death in cases where no official cause of death (CoD) was determined by a physician

  • In comparison to our model’s sensitivity of .770 for adult deaths and .695 for child deaths, Miasnikof et al. [17] reported a mean sensitivity of .57 on MDS checklist data from child and adult deaths with their naïve Bayes classifier and 16 CoD categories. They compared their results to InterVA-4 on the Million Death Study data, which achieved .43, and the Tariff Method, which achieved .50 sensitivity

  • We have shown that a variety of narrative-based machine learning classifiers can be used for automated VA coding
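The two kinds of figures reported here, individual-level sensitivity and population-level cause-specific mortality fraction (CSMF) accuracy, can be sketched as follows. This is an illustrative sketch, not the authors' evaluation code. It assumes sensitivity means per-category recall, TP / (TP + FN), and it assumes the commonly used definition of CSMF accuracy, 1 − Σ|true fraction − predicted fraction| / (2 · (1 − smallest true fraction)). How the paper averages sensitivity across categories is not stated in this summary, so no averaging is done. The function names and toy labels are hypothetical.

```python
from collections import Counter


def sensitivity_per_class(true_labels, pred_labels):
    """Return {category: TP / (TP + FN)} for each category in true_labels."""
    totals = Counter(true_labels)
    hits = Counter(t for t, p in zip(true_labels, pred_labels) if t == p)
    return {c: hits[c] / totals[c] for c in totals}


def csmf_accuracy(true_labels, pred_labels):
    """CSMF accuracy under the assumed definition given in the lead-in.

    Assumes at least two categories occur; with a single category the
    denominator would be zero.
    """
    n = len(true_labels)
    true_counts = Counter(true_labels)
    pred_counts = Counter(pred_labels)
    categories = set(true_counts) | set(pred_counts)
    # Total absolute difference between true and predicted cause fractions.
    abs_error = sum(abs(true_counts[c] - pred_counts[c]) / n for c in categories)
    # Smallest true fraction (0 for a category that was only ever predicted).
    min_true = min(true_counts[c] for c in categories) / n
    return 1 - abs_error / (2 * (1 - min_true))


# Invented toy labels for illustration only.
true = ["a", "a", "b", "c"]
pred = ["a", "b", "b", "c"]

sens = sensitivity_per_class(true, pred)
csmf = csmf_accuracy(true, pred)
print(sens, round(csmf, 3))
```

The toy example shows why the two metrics can diverge: CSMF accuracy compares only the predicted and true cause distributions, so individual misclassifications that cancel out across categories lower sensitivity without lowering CSMF accuracy.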


Summary

Introduction

A verbal autopsy (VA) is a post-hoc written interview report of the symptoms preceding a person’s death in cases where no official cause of death (CoD) was determined by a physician. [...] who die at home without medical attention (such as education level, access to hospital care, types of pathogens, etc.) [3, 5, 6]. For this reason, physician-coded VAs are often used for training and testing automated CoD coding methods.

Source: Jeblee et al., BMC Medical Informatics and Decision Making (2019) 19:127.

