Abstract
Identifying the underlying cause of death is of primary importance and one of the most challenging issues in healthcare policy making. The World Health Organization provides guidelines for coding death certificates using the ICD-10 classification. These guidelines can be applied manually, but some coding support systems implement them to simplify the coding work. Nevertheless, countries differ in the level and quality of death certificate registration. In this work we propose an effective supervised model based on Natural Language Processing algorithms that aims to correctly classify the underlying cause of death from death certificates. We compared tabular representations of the death certificate, including the hierarchical path of each condition in the classification, with a novel representation that translates the conditions expressed as ICD-10 codes back to their standard titles. In our experimental evaluation, after training on 10.5 million certificates, the model reached 99.03% accuracy, which outperforms current state-of-the-art systems. To assess its practical applicability, we studied performance by classification chapter and found that accuracy is low only for chapters that include very rare causes of death. Finally, to show the robustness of our model, we use the model's confidence to help identify death certificates that require manual coding.
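As a rough illustration of two ideas mentioned in the abstract, the sketch below turns a list of ICD-10 codes into a text sequence of their titles and flags low-confidence predictions for manual coding. It is not the authors' implementation: the function names, the three-entry title lookup, and the 0.9 confidence threshold are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

# Illustrative excerpt only; a real system would load the full ICD-10 classification.
ICD10_TITLES: Dict[str, str] = {
    "I10": "Essential (primary) hypertension",
    "I21.9": "Acute myocardial infarction, unspecified",
    "J18.9": "Pneumonia, unspecified",
}


def certificate_to_text(codes: List[str], titles: Dict[str, str]) -> str:
    """Replace each ICD-10 code with its title, keeping the raw code when no title is known."""
    return "; ".join(titles.get(code, code) for code in codes)


def route_by_confidence(
    probabilities: Dict[str, float], threshold: float = 0.9
) -> Tuple[str, bool]:
    """Return the most probable code and a flag saying whether manual coding is advised.

    The threshold is a hypothetical value, not one reported in the paper.
    """
    best_code = max(probabilities, key=probabilities.get)
    needs_manual = probabilities[best_code] < threshold
    return best_code, needs_manual


if __name__ == "__main__":
    # Text representation that a sequence classifier could consume.
    print(certificate_to_text(["J18.9", "I21.9", "I10"], ICD10_TITLES))
    # A confident prediction is accepted; an uncertain one is sent to a human coder.
    print(route_by_confidence({"I21.9": 0.97, "J18.9": 0.03}))
    print(route_by_confidence({"I21.9": 0.55, "J18.9": 0.45}))
```

In practice the threshold would be tuned on held-out certificates to balance the volume of manual coding against the error rate of the automatically coded cases.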