KAICD: A knowledge attention-based deep learning framework for automatic ICD coding

Yifan Wu, Min Zeng, Zhihui Fei, Ying Yu, Fang-Xiang Wu, Min Li

doi:10.1016/j.neucom.2020.05.115

Abstract

Automatic International Classification of Diseases (ICD) coding is an important task for the future of artificial intelligence in healthcare. In recent years, many traditional machine learning-based methods have been proposed and have achieved good results on this task. However, these methods focus only on the semantic features of clinical notes and ignore the ICD titles, which are the textual descriptions of ICD codes. In this paper, we propose KAICD, a knowledge attention-based deep learning framework for automatic ICD coding that makes full use of both the clinical notes and the ICD titles. The semantic features of clinical notes are extracted by a multi-scale convolutional neural network. For ICD titles, we use an attention-based Bidirectional Gated Recurrent Unit (Bi-GRU) to build a knowledge database that offers additional information. Because some ICD titles are more relevant to a given clinical note than others, an attention mechanism conditioned on the input note produces note-specific knowledge vectors from the knowledge database. Finally, we concatenate the knowledge vectors with the semantic features of the clinical notes and use them for the final prediction. KAICD is tested on the public dataset Medical Information Mart for Intensive Care III (MIMIC-III); it achieves a micro-precision of 0.502, a micro-recall of 0.428, and a micro-F1 of 0.462, outperforming the competing methods. Furthermore, an ablation study shows that the knowledge database of ICD titles learned by the attention-based Bi-GRU enhances the feature representation and improves prediction performance.
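The knowledge attention step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the dot-product scoring, the assumption that note features and title vectors share one dimension, and the function names `knowledge_vector` and `micro_f1` are all hypothetical choices made for illustration. The last helper only checks that the reported micro-F1 is the harmonic mean of the reported micro-precision and micro-recall.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def knowledge_vector(note_features, title_vectors):
    """Attend over ICD title vectors (the "knowledge database").

    Assumption: relevance is scored with a plain dot product between the
    note features and each title vector; the abstract does not specify
    the scoring function.
    """
    scores = [
        sum(n * t for n, t in zip(note_features, title))
        for title in title_vectors
    ]
    weights = softmax(scores)
    dim = len(title_vectors[0])
    # Weighted sum of title vectors: more relevant titles contribute more.
    knowledge = [
        sum(w * title[i] for w, title in zip(weights, title_vectors))
        for i in range(dim)
    ]
    return knowledge, weights


def micro_f1(precision, recall):
    """Harmonic mean of micro-precision and micro-recall."""
    return 2 * precision * recall / (precision + recall)


# Toy inputs (made up): one note feature vector and two title vectors.
note_features = [1.0, 0.0]
title_vectors = [[1.0, 0.0], [0.0, 1.0]]

knowledge, weights = knowledge_vector(note_features, title_vectors)

# Concatenate note features with the knowledge vector; in the framework
# this combined vector would feed the final prediction layer.
combined = note_features + knowledge

# Consistency check on the reported metrics.
reported_f1 = round(micro_f1(0.502, 0.428), 3)
```

In this toy case the first title aligns with the note, so it receives the larger attention weight, and the combined vector has twice the dimension of the note features.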

Full Text