Extreme Multi-label Classification Research Articles

Encouraged by the success of pretrained Transformer models in many natural language processing tasks, their use for International Classification of Diseases (ICD) coding tasks is now actively being explored. In this study, we investigated two existing Transformer-based models (PLM-ICD and XR-Transformer) and proposed a novel Transformer-based model (XR-LAT), aiming to address the extreme label set and long text classification challenges that are posed by automated ICD coding tasks. The Transformer-based model PLM-ICD, which currently holds the state-of-the-art (SOTA) performance on the ICD coding benchmark datasets MIMIC-III and MIMIC-II, was selected as our baseline model for further optimisation on both datasets. In addition, we extended the capabilities of the leading model in the general extreme multi-label text classification domain, XR-Transformer, to support longer sequences and trained it on both datasets. Moreover, we proposed a novel model, XR-LAT, which was also trained on both datasets. XR-LAT is a recursively trained model chain on a predefined hierarchical code tree with label-wise attention, knowledge transferring and dynamic negative sampling mechanisms. Our optimised PLM-ICD models, which were trained with longer total and chunk sequence lengths, significantly outperformed the current SOTA PLM-ICD models, and achieved the highest micro-F1 scores of 60.8 % and 50.9 % on MIMIC-III and MIMIC-II, respectively. The XR-Transformer model, although SOTA in the general domain, did not perform well across all metrics. The best XR-LAT based models obtained results that were competitive with the current SOTA PLM-ICD models, including improving the macro-AUC by 2.1 % and 5.1 % on MIMIC-III and MIMIC-II, respectively. Our optimised PLM-ICD models are the new SOTA models for automated ICD coding on both datasets, while our novel XR-LAT models perform competitively with the previous SOTA PLM-ICD models.
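The abstract attributes the gains of the optimised PLM-ICD models to longer total and chunk sequence lengths, but does not spell out how a document is split. The following is a minimal sketch of one plausible reading: truncate the token sequence to a total budget, then cut it into consecutive fixed-size chunks that an encoder with a short input window can process one at a time. The function name `chunk_tokens` and the parameters `max_total_len` and `chunk_len` are illustrative assumptions, not names taken from the PLM-ICD code.

```python
from typing import List


def chunk_tokens(token_ids: List[int], max_total_len: int, chunk_len: int) -> List[List[int]]:
    """Truncate a token sequence to max_total_len, then split it into
    consecutive, non-overlapping chunks of at most chunk_len tokens.

    Assumption: chunks do not overlap; the abstract does not say
    whether the original models use a stride.
    """
    if max_total_len <= 0 or chunk_len <= 0:
        raise ValueError("max_total_len and chunk_len must be positive")
    truncated = token_ids[:max_total_len]
    return [truncated[i:i + chunk_len] for i in range(0, len(truncated), chunk_len)]


# Toy example: 10 tokens, a total budget of 8, and chunks of 3.
chunks = chunk_tokens(list(range(10)), max_total_len=8, chunk_len=3)
print(chunks)  # [[0, 1, 2], [3, 4, 5], [6, 7]]
```

Under this reading, raising the total length lets more of a long discharge summary reach the model, while raising the chunk length gives each encoder pass more context.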

Background: The tenth revision of the International Classification of Diseases (ICD-10) is widely used for epidemiological research and health management. The clinical modification (CM) and procedure coding system (PCS) of ICD-10 were developed to describe more clinical details with increasing diagnosis and procedure codes and applied in disease-related groups for reimbursement. The expansion of codes made the coding time-consuming and less accurate. The state-of-the-art model using deep contextual word embeddings was used for automatic multilabel text classification of ICD-10. In addition to the discharge diagnoses (DD) used as input, the performance can be improved by appropriate preprocessing methods for the text from other document types, such as medical history, comorbidity and complication, surgical method, and special examination.

Objective: This study aims to establish a contextual language model with rule-based preprocessing methods to develop the model for ICD-10 multilabel classification.

Methods: We retrieved electronic health records from a medical center. We first compared different word embedding methods. Second, we compared the preprocessing methods using the best-performing embeddings. We compared biomedical bidirectional encoder representations from transformers (BioBERT), clinical generalized autoregressive pretraining for language understanding (Clinical XLNet), label tree-based attention-aware deep model for high-performance extreme multilabel text classification (AttentionXML), and word-to-vector (Word2Vec) to predict ICD-10-CM. To compare different preprocessing methods for ICD-10-CM, we included DD, medical history, and comorbidity and complication as inputs. We compared the performance of ICD-10-CM prediction using different preprocesses, including definition training, external cause code removal, number conversion, and combination code filtering. For the ICD-10-PCS, the model was trained using different combinations of DD, surgical method, and key words of special examination. The micro F1 score and the micro area under the receiver operating characteristic curve were used to compare the model’s performance with that of different preprocessing methods.

Results: BioBERT had an F1 score of 0.701 and outperformed other models such as Clinical XLNet, AttentionXML, and Word2Vec. For the ICD-10-CM, the model had an F1 score that significantly increased from 0.749 (95% CI 0.744-0.753) to 0.769 (95% CI 0.764-0.773) with the ICD-10 definition training, external cause code removal, number conversion, and combination code filtering. For the ICD-10-PCS, the model had an F1 score that significantly increased from 0.670 (95% CI 0.663-0.678) to 0.726 (95% CI 0.719-0.732) with a combination of discharge diagnoses, surgical methods, and key words of special examination. With our preprocessing methods, the model had the highest area under the receiver operating characteristic curve of 0.853 (95% CI 0.849-0.855) and 0.831 (95% CI 0.827-0.834) for ICD-10-CM and ICD-10-PCS, respectively.

Conclusions: The performance of our model with the pretrained contextualized language model and rule-based preprocessing method is better than that of the state-of-the-art model for ICD-10-CM or ICD-10-PCS. This study highlights the importance of rule-based preprocessing methods based on coder coding rules.
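For readers unfamiliar with the micro F1 score used to compare the models, the sketch below shows how it is commonly computed for multilabel predictions: true positives, false positives, and false negatives are pooled over all documents and all codes before precision and recall are combined. The function name `micro_f1` and the example ICD-10 code sets are illustrative assumptions and are not data from the study.

```python
from typing import List, Set


def micro_f1(y_true: List[Set[str]], y_pred: List[Set[str]]) -> float:
    """Micro-averaged F1 over per-document label sets.

    Counts are pooled across all documents and labels, so frequent
    codes influence the score more than rare ones.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fp = fn = 0
    for gold, pred in zip(y_true, y_pred):
        tp += len(gold & pred)   # codes predicted and correct
        fp += len(pred - gold)   # codes predicted but not assigned
        fn += len(gold - pred)   # codes assigned but missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical gold and predicted code sets for two documents.
gold = [{"I10", "E11.9"}, {"J18.9"}]
pred = [{"I10"}, {"J18.9", "I10"}]
print(round(micro_f1(gold, pred), 4))  # 0.6667
```

Here the pooled counts are 2 true positives, 1 false positive, and 1 false negative, giving 2·2 / (2·2 + 1 + 1) = 4/6.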

Articles published on Extreme Multi-label Classification

BoostXML: Gradient Boosting for Extreme Multilabel Text Classification With Tail Labels

ICDXML: enhancing ICD coding with probabilistic label trees and dynamic semantic representations

TLC-XML: Transformer with Label Correlation for Extreme Multi-label Text Classification

Meta-classifier free negative sampling for extreme multilabel classification

Automated ICD coding using extreme multi-label long text transformer-based models

Fast block-wise partitioning for extreme multi-label classification

CRAT-XML: Contrastive Representation Adversarial Training for Extremely Multi-Label Text Classification

GUDN: A novel guide network with label reinforcement strategy for extreme multi-label text classification

XRR: Extreme multi-label text classification with candidate retrieving and deep ranking

Multi-Aspect co-Attentional Collaborative Filtering for extreme multi-label text classification

The Emerging Trends of Multi-Label Learning

Speeding-up one-versus-all training for extreme classification via mean-separating initialization

Automatic International Classification of Diseases Coding System: Deep Contextualized Language Model With Rule-Based Approaches

Impact of preprocessing and word embedding on extreme multi-label patent classification tasks

Implementation of specialised attention mechanisms: ICD-10 classification of Gastrointestinal discharge summaries in English, Spanish and Swedish

BGNN-XML: Bilateral Graph Neural Networks for Extreme Multi-label Text Classification

LightXML: Transformer with Dynamic Negative Sampling for High-Performance Extreme Multi-label Text Classification

Learning from eXtreme Bandit Feedback

Fine-grained Generalization Analysis of Vector-Valued Learning

Label-Aware Document Representation via Hybrid Attention for Extreme Multi-Label Text Classification
