Abstract

Background
Cause-of-death (CoD) statistics are key indicators in epidemiology and public health. These statistics are derived from death certificates completed by physicians and are usually coded by official statistics authorities according to the WHO ICD-10 classification, so that they are comparable over time and across countries. In France, causes of death reported as free text are coded to ICD-10 using predictions by deep neural networks (DNNs), in addition to fully automatic batch coding by a rule-based expert system and interactive coding by the coding team.

Methods
Seq-to-seq DNNs are trained from scratch on previously coded data to assign ICD-10 codes to the multiple causes and the underlying cause of death. Human coding is focused on certificates with a special public health interest or with low confidence in AI prediction quality, to maximize the quality of the overall statistics disseminated when human resources are limited. DNNs also directly predict multiple causes and underlying causes of death for part of the certificates. Hence, the coding campaign aims to optimally allocate a coding mode to each certificate.

Results
For deaths in 2021, 63% of the certificates are coded automatically in batch by the expert system, 14% by the coding team, and 23% by DNNs. Compared to a traditional campaign that would have relied on automatic batch coding and manual coding, the present campaign reaches an accuracy of 95.7% for ICD-10 coding of the underlying cause (97.3% at the European shortlist level).

Conclusions
The three-mode coding approach enables timely dissemination of CoD statistics and may in the future lead to more responsive surveillance.

Key messages
• Deep neural networks trained on already coded data can be integrated into the regular statistical production of the CoD database in a fully controlled way.
• The allocation between the different coding modes can be optimized to achieve the best quality and timeliness under the constraint of limited human resources.
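
The allocation idea described in the Methods can be illustrated with a minimal sketch. The abstract does not specify how the allocation is actually optimized, so the greedy rule below is an assumption made purely for illustration: certificates handled by the expert system stay in batch mode, and the remaining ones are ranked so that public-health-interest certificates, then those with the lowest DNN confidence, go to the coding team until its capacity is used up. The class `Certificate`, its fields, and the function `allocate_modes` are hypothetical names, not part of the authors' system.

```python
from dataclasses import dataclass


@dataclass
class Certificate:
    # All fields are hypothetical; the abstract does not describe the data model.
    cert_id: str
    expert_system_coded: bool     # rule-based expert system produced a code
    public_health_interest: bool  # flagged as being of special interest
    dnn_confidence: float         # assumed confidence score in [0, 1]


def allocate_modes(certs, human_capacity):
    """Greedy illustration: return a dict mapping cert_id to a coding mode."""
    modes = {}
    remaining = []
    for cert in certs:
        if cert.expert_system_coded:
            modes[cert.cert_id] = "batch"
        else:
            remaining.append(cert)

    # Rank for human review: flagged certificates first (False sorts before
    # True), then increasing DNN confidence.
    remaining.sort(key=lambda c: (not c.public_health_interest, c.dnn_confidence))

    # The first `human_capacity` certificates go to the coding team,
    # the rest are left to the DNN prediction.
    for rank, cert in enumerate(remaining):
        modes[cert.cert_id] = "human" if rank < human_capacity else "dnn"
    return modes


if __name__ == "__main__":
    sample = [
        Certificate("A", True, False, 0.80),
        Certificate("B", False, True, 0.99),
        Certificate("C", False, False, 0.40),
        Certificate("D", False, False, 0.90),
    ]
    print(allocate_modes(sample, human_capacity=2))
```

With a capacity of two human-coded certificates, the flagged certificate B and the low-confidence certificate C go to the coding team, while D is left to the DNN. A production system would presumably replace this greedy rule with whatever optimization criterion the campaign actually uses.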