A Curriculum Batching Strategy for Automatic ICD Coding with Deep Multi-Label Classification Models.

Yaqiang Wang, Hongping Shu, Xuechao Hao, Tao Zhu, Xu Han

doi:10.3390/healthcare10122397

Abstract

The International Classification of Diseases (ICD) plays an important role in building applications for clinical medicine. In automatic ICD coding, deep multi-label classification models are trained with minibatch gradient descent (MBGD), and the extremely large ICD label set and its imbalanced label distribution make the data distribution of individual batches inconsistent with the global distribution of the training data. This inconsistency in turn leads to overfitting. To improve the performance and generalization ability of deep learning models for automatic ICD coding, we propose a simple and effective curriculum batching strategy for the MBGD-based training procedure. The strategy generates three batch sets offline by applying three predefined sampling algorithms. The batch sets follow a uniform data distribution, a shuffling data distribution, and the original training data distribution, respectively, and the corresponding learning tasks range from simple to complex. Experiments show that replacing the original shuffling-based batching strategy with the proposed curriculum batching strategy dramatically improves the performance of all three investigated deep multi-label classification models for automatic ICD coding. At the same time, the models avoid overfitting and learn long-tailed label information better. Their performance also exceeds that of a state-of-the-art (SOTA) label set reconstruction model.
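The offline generation of three batch sets can be illustrated with a minimal sketch. The abstract does not specify the three sampling algorithms, so everything below is an assumption made for illustration: the function names (`label_sampled_batches`, `shuffled_batches`, `curriculum_batches`) are hypothetical, the "uniform" set is approximated by drawing a label uniformly and then a sample that carries it, and the "original distribution" set is approximated by drawing labels in proportion to their training frequency. It is not the authors' implementation.

```python
import random
from collections import defaultdict


def build_label_index(label_sets):
    """Map each label to the indices of the samples that carry it."""
    index = defaultdict(list)
    for i, labels in enumerate(label_sets):
        for label in labels:
            index[label].append(i)
    return dict(index)


def label_sampled_batches(label_sets, batch_size, n_batches, rng, weighted):
    """Draw a label first, then a random sample carrying that label.

    weighted=False: labels are drawn uniformly, so rare codes appear as
    often as frequent ones (assumed stand-in for the uniform batch set).
    weighted=True: labels are drawn in proportion to their frequency
    (assumed stand-in for the original-distribution batch set).
    Samples with no labels are never drawn by this sampler.
    """
    index = build_label_index(label_sets)
    labels = sorted(index)
    weights = [len(index[l]) if weighted else 1 for l in labels]
    batches = []
    for _ in range(n_batches):
        picked = rng.choices(labels, weights=weights, k=batch_size)
        batches.append([rng.choice(index[l]) for l in picked])
    return batches


def shuffled_batches(n_samples, batch_size, rng):
    """Shuffle all indices once, then cut them into consecutive batches."""
    idx = list(range(n_samples))
    rng.shuffle(idx)
    return [idx[i:i + batch_size] for i in range(0, n_samples, batch_size)]


def curriculum_batches(label_sets, batch_size, seed=0):
    """Concatenate the three batch sets in the order named in the abstract:
    uniform, then shuffled, then original distribution."""
    rng = random.Random(seed)
    n = len(label_sets)
    n_batches = -(-n // batch_size)  # ceiling division
    uniform = label_sampled_batches(label_sets, batch_size, n_batches, rng, weighted=False)
    shuffled = shuffled_batches(n, batch_size, rng)
    original = label_sampled_batches(label_sets, batch_size, n_batches, rng, weighted=True)
    return uniform + shuffled + original
```

Each returned batch is a list of sample indices, so a training loop could iterate over the result in order and look up the corresponding documents and label vectors. How many batches each stage contributes, and whether stages repeat across epochs, are further choices the abstract leaves open.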

Journal: Healthcare (Basel, Switzerland)
Publication Date: Nov 29, 2022
License type: CC BY 4.0

Similar Papers

A Pseudo Label-Wise Attention Network for Automatic ICD Coding.
Yifan Wu ... Ying Yu
IEEE Journal of Biomedical and Health Informatics | VOL. 26
01 Oct 2022

KAICD: A knowledge attention-based deep learning framework for automatic ICD coding
Yifan Wu ... Min Li
Neurocomputing | VOL. 469
28 Oct 2020

AnEMIC: A Framework for Benchmarking ICD Coding Models.
Juyong Kim ... Jeremy C Weiss
Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing | VOL. 2022
01 Jan 2021

Combining transformer-based model and GCN to predict ICD codes from clinical records
Pengli Lu ... Jingjin Xue
Knowledge-Based Systems | VOL. 282
27 Oct 2023
