A comparative study of pretrained language models for long clinical text.

Yikuan Li, Faraz S Ahmad, Hanyin Wang, Yuan Luo, Ramsey M Wehbe

doi:10.1093/jamia/ocac225

Abstract

Clinical knowledge-enriched transformer models (eg, ClinicalBERT) have achieved state-of-the-art results on clinical natural language processing (NLP) tasks. A core limitation of these transformer models is the substantial memory consumption of their full self-attention mechanism, which leads to performance degradation on long clinical texts. To overcome this, we propose to leverage long-sequence transformer models (eg, Longformer and BigBird), which extend the maximum input sequence length from 512 to 4096 tokens, to enhance the ability to model long-term dependencies in long clinical texts. Inspired by the success of long-sequence transformer models and the fact that clinical notes are mostly long, we introduce 2 domain-enriched language models, Clinical-Longformer and Clinical-BigBird, which are pretrained on a large-scale clinical corpus. We evaluate both language models on 10 baseline tasks, including named entity recognition, question answering, natural language inference, and document classification. The results demonstrate that Clinical-Longformer and Clinical-BigBird consistently and significantly outperform ClinicalBERT and other short-sequence transformers in all 10 downstream tasks and achieve new state-of-the-art results. Our pretrained language models provide the bedrock for clinical NLP using long texts. We have made our source code available at https://github.com/luoyuanlab/Clinical-Longformer, and the pretrained models are available for public download at https://huggingface.co/yikuan8/Clinical-Longformer. This study demonstrates that clinical knowledge-enriched long-sequence transformers are able to learn long-term dependencies in long clinical text. Our methods can also inspire the development of other domain-enriched long-sequence transformers.
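To make the memory argument concrete, the following is a minimal sketch (not the authors' code) that counts attention scores per head for full self-attention versus a simple sliding-window pattern. The function names and the window size of 512 are illustrative assumptions. Longformer and BigBird also use global attention tokens (and, for BigBird, random attention), which this sketch leaves out.

```python
def full_attention_entries(seq_len: int) -> int:
    """Scores per head when every token attends to every token: seq_len ** 2."""
    return seq_len * seq_len


def sliding_window_entries(seq_len: int, window: int) -> int:
    """Scores per head when each token attends only to itself and up to
    window // 2 neighbours on each side (clipped at the sequence edges)."""
    half = window // 2
    total = 0
    for i in range(seq_len):
        lo = max(0, i - half)            # leftmost position token i can see
        hi = min(seq_len - 1, i + half)  # rightmost position token i can see
        total += hi - lo + 1
    return total


if __name__ == "__main__":
    window = 512  # assumed window size, chosen only for illustration
    for n in (512, 4096):
        full = full_attention_entries(n)
        local = sliding_window_entries(n, window)
        print(f"seq_len={n}: full={full}, sliding_window={local}")
```

Under full self-attention, moving from 512 to 4096 tokens multiplies the number of scores by 64, whereas the windowed count grows roughly linearly with sequence length. That difference is what makes the longer input length practical.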

Journal: Journal of the American Medical Informatics Association
Publication Date: Nov 30, 2022
Citations: 38
