Abstract
Objectives: To develop an automated International Classification of Diseases (ICD) coding tool using natural language processing (NLP) and discharge summary texts from Thailand.
Materials and methods: The development phase included 15,329 discharge summaries from Ramathibodi Hospital from January 2015 to December 2020. The external validation phase included Medical Information Mart for Intensive Care III (MIMIC-III) data. Three algorithms were developed: naïve Bayes with term frequency-inverse document frequency (NB-TF-IDF), convolutional neural network with neural word embedding (CNN-NWE), and CNN with PubMedBERT (CNN-PubMedBERT). Two state-of-the-art models were also considered: convolutional attention for multi-label classification (CAML) and pretrained language models for automatic ICD coding (PLM-ICD).
Results: The CNN-PubMedBERT model provided average micro- and macro-area under the precision-recall curve (AUPRC) of 0.6605 and 0.5538. This compared with CNN-NWE (0.6528 and 0.5564), NB-TF-IDF (0.4441 and 0.3562), and CAML (0.6257 and 0.4964), with corresponding differences of (0.0077 and −0.0026), (0.2164 and 0.1976), and (0.0348 and 0.0574), respectively; that is, CNN-PubMedBERT outperformed all three on micro-AUPRC and all but CNN-NWE on macro-AUPRC. However, CNN-PubMedBERT performed less well than PLM-ICD, which achieved micro- and macro-AUPRCs of 0.7202 and 0.5865. The CNN-PubMedBERT model was externally validated using two subsets of MIMIC-III, the MIMIC-ICD-10 and MIMIC-ICD-9 datasets, which contained 40,923 and 31,196 discharge summaries. The average micro-AUPRCs were 0.3745, 0.6878, and 0.6699 for the direct-prediction MIMIC-ICD-10, MIMIC-ICD-10 fine-tuning, and MIMIC-ICD-9 fine-tuning approaches; the corresponding average macro-AUPRCs were 0.2819, 0.4219, and 0.5377, respectively.
Discussion: CNN-PubMedBERT ranked second to PLM-ICD, with considerable variation observed between average micro- and macro-AUPRC, especially for external validation, generally indicating good overall prediction but limited predictive value for small sample sizes. External validation in a US cohort demonstrated a higher level of model prediction performance.
Conclusion: Both PLM-ICD and CNN-PubMedBERT models may provide useful tools for automated ICD-10 coding. Nevertheless, further evaluation and validation within Thai and Asian healthcare systems may prove more informative for clinical application.
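
The gap between micro- and macro-AUPRC noted in the Discussion can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names, the toy labels and scores, and the step-wise (average precision) estimate of the area under the precision-recall curve are all assumptions made for illustration. Micro-averaging pools every (document, code) pair before computing one curve, so frequent codes dominate; macro-averaging computes one curve per code and takes the unweighted mean, so a rare, poorly ranked code pulls the result down.

```python
from typing import List, Sequence


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve for one binary label.

    Assumption: AUPRC is estimated as sum((R_n - R_{n-1}) * P_n) over score
    thresholds; the paper may have used a different estimator.
    """
    total_pos = sum(y_true)
    if total_pos == 0:
        raise ValueError("label has no positive examples")
    # Rank items from highest to lowest predicted score.
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    ap, tp, prev_recall = 0.0, 0, 0.0
    i = 0
    while i < len(ranked):
        # Treat all items sharing the same score as a single threshold.
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            tp += ranked[j][1]
            j += 1
        precision = tp / j          # j items have been retrieved so far
        recall = tp / total_pos
        ap += (recall - prev_recall) * precision
        prev_recall = recall
        i = j
    return ap


def macro_auprc(Y: List[List[int]], S: List[List[float]]) -> float:
    """Unweighted mean of the per-code AUPRC values."""
    n_labels = len(Y[0])
    per_label = [
        average_precision([row[k] for row in Y], [row[k] for row in S])
        for k in range(n_labels)
    ]
    return sum(per_label) / n_labels


def micro_auprc(Y: List[List[int]], S: List[List[float]]) -> float:
    """AUPRC over all (document, code) pairs pooled together."""
    flat_y = [v for row in Y for v in row]
    flat_s = [v for row in S for v in row]
    return average_precision(flat_y, flat_s)


# Hypothetical toy data: 6 documents, 2 codes.
# Column 0 is a common code that is ranked perfectly;
# column 1 is a rare code (one positive) that is ranked poorly.
Y = [[1, 0], [1, 0], [1, 0], [1, 1], [0, 0], [0, 0]]
S = [[0.9, 0.25], [0.8, 0.65], [0.7, 0.15],
     [0.6, 0.35], [0.3, 0.55], [0.2, 0.45]]

if __name__ == "__main__":
    print(f"micro-AUPRC: {micro_auprc(Y, S):.3f}")
    print(f"macro-AUPRC: {macro_auprc(Y, S):.3f}")
```

On this toy input the micro-average (about 0.885) is well above the macro-average (0.625), because the single rare code contributes half of the macro score but only one of the five pooled positives.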