Predicting Clinical Events Based on Raw Text: From Bag-of-Words to Attention-Based Transformers.

Dmitri Roussinov, Andrew Patterson, Christopher Sainsbury, Andrew Conkie

doi:10.3389/fdgth.2021.810260


Open Access

https://doi.org/10.3389/fdgth.2021.810260


Abstract

Identifying which patients are at higher risk of dying or being re-admitted can save both resources and lives, and is therefore an important and challenging task for healthcare text analytics. While many successful approaches exist to predict such clinical events based on categorical and numerical variables, a large amount of health records exists in the form of raw text such as clinical notes or discharge summaries. However, the text-analytics models applied to the free-form natural language found in those notes lag behind the breakthroughs happening in other domains and remain primarily based on older bag-of-words technologies. As a result, they rarely reach the accuracy level acceptable to clinicians. Despite the success of deep neural approaches in other domains, their superiority over classical bags of words for this task has not yet been convincingly demonstrated. Also, while some successful experiments have been reported, the most recent breakthroughs due to pre-trained language models have not yet made their way into the medical domain.

Using a publicly available healthcare dataset, we have explored several classification models to predict patients' re-admission or fatality based on their discharge summaries and established the following. 1) The performance of the neural models used in our experiments convincingly exceeds that of the models based on bag-of-words by several percentage points as measured by the standard metrics. 2) This allows us to achieve the accuracy that clinicians typically accept as practically useful (area under the ROC curve above 0.70) for the majority of our prediction targets. 3) While the pre-trained attention-based transformer performed only on par with the model that averages word embeddings when applied to full-length discharge summaries, the transformer still handles shorter text segments substantially better, at times with a margin of 0.04 in the area under the ROC curve. Thus, our findings extend the success of pre-trained language models reported in other domains to the task of clinical event prediction, and likely to other text-classification tasks in the healthcare analytics domain. 4) We suggest several models to overcome the transformers' major drawback (their input size limitation), and confirm that overcoming it is crucial to achieving their top performance. Our modifications are domain agnostic, and thus can be applied in other applications where the text inputs exceed 200 words. 5) We have successfully demonstrated how non-text attributes (such as patient age, demographics, type of admission, etc.) can be combined with text to gain additional improvements for several prediction targets. We include extensive ablation studies showing the impact of the training size and highlighting the tradeoffs between performance and the resources needed.
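The abstract does not say which models the authors propose for the input size limitation. One common workaround, sketched here purely as an illustration and not as the paper's method, is to split a long discharge summary into word windows, score each window separately, and aggregate the per-window scores into one document-level score. The function names, the 200-word default window, the `stride` parameter, and the choice of mean or max aggregation are all assumptions made for this example; `score_segment` stands in for any classifier that returns a risk score for a short text.

```python
from statistics import mean
from typing import Callable, List, Sequence


def split_into_segments(text: str, max_words: int = 200, stride: int = 200) -> List[str]:
    """Split text into windows of at most max_words whitespace-separated words.

    A stride smaller than max_words produces overlapping windows.
    Both defaults are illustrative assumptions, not values from the paper.
    """
    if max_words <= 0 or stride <= 0:
        raise ValueError("max_words and stride must be positive")
    words = text.split()
    segments: List[str] = []
    for start in range(0, len(words), stride):
        segments.append(" ".join(words[start:start + max_words]))
        # Stop once a window has reached the end of the document.
        if start + max_words >= len(words):
            break
    return segments


def score_document(
    text: str,
    score_segment: Callable[[str], float],
    max_words: int = 200,
    stride: int = 200,
    aggregate: Callable[[Sequence[float]], float] = mean,
) -> float:
    """Score each window with score_segment and combine the results.

    score_segment is a hypothetical placeholder for a short-text model.
    """
    segments = split_into_segments(text, max_words=max_words, stride=stride)
    if not segments:
        raise ValueError("cannot score an empty document")
    return aggregate([score_segment(segment) for segment in segments])
```

For example, `score_document(summary, model_score, aggregate=max)` would flag a document if any single window looks high-risk, while the default mean smooths over windows; which aggregation works better is an empirical question that this sketch does not answer.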

Highlights

Identification of patients who are likely to be readmitted or at higher risk of future complications can provide significant benefits for both patients and medical providers in terms of reducing health risks, maintaining patients’ quality of life, and avoiding the markers of substandard healthcare.
Electronic health records (EHRs) contain a wealth of information including patient demographics, laboratory tests, prescriptions, radiological images, and clinical notes written by attending physicians.
Using a publicly available dataset with discharge summaries, we have adapted and compared several text classification models to predict readmission or fatality at various time intervals and established that: 1) The performance of the deep neural models that we have tested exceeds that of the older but still dominant “bags of words” approaches by several percentage points. We believe that this finding is a major testament to the success of deep learning models, and to the use of longer texts for clinical event prediction, which prior work has not yet convincingly demonstrated.

Summary

INTRODUCTION

Identification of patients who are likely to be readmitted or at higher risk of future complications can provide significant benefits for both patients and medical providers in terms of reducing health risks, maintaining patients’ quality of life, and avoiding the markers of substandard healthcare. Using a publicly available dataset with discharge summaries, we have adapted and compared several text classification models to predict readmission or fatality at various time intervals and established that: 1) The performance of the deep neural models that we have tested exceeds that of the older but still dominant “bags of words” approaches by several percentage points. We believe that this finding is a major testament to the success of deep learning models, and to the use of longer texts for clinical event prediction, which prior work has not yet convincingly demonstrated. The sections below give an overview of the related work, followed by a description of the models used, the empirical testing, and the conclusions.

Clinical Event Prediction

Pre-trained Transformers

THE MODELS EXPLORED

Bag-of-Words

Mean-Pooling N-Gram Embeddings

Attention-Based Transformer

Resolving Transformer Input Size Limit

The Datasets

Prediction Targets

Metrics

Hyperparameters

Implementation

Results

Ablation Studies

Combining With Non-text Attributes

CONCLUSIONS


Journal: Frontiers in digital health	Publication Date: Feb 21, 2022
Citations: 2	License type: CC BY 4.0
