Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing

Yu Gu, Tristan Naumann, Robert Tinn, Naoto Usuyama, Hoifung Poon, Michael Lucas, Xiaodong Liu, Jianfeng Gao, Hao Cheng

doi:10.1145/3458754

Abstract

Pretraining large neural language models, such as BERT, has led to impressive gains on many natural language processing (NLP) tasks. However, most pretraining efforts focus on general domain corpora, such as newswire and Web. A prevailing assumption is that even domain-specific pretraining can benefit by starting from general-domain language models. In this article, we challenge this assumption by showing that for domains with abundant unlabeled text, such as biomedicine, pretraining language models from scratch results in substantial gains over continual pretraining of general-domain language models. To facilitate this investigation, we compile a comprehensive biomedical NLP benchmark from publicly available datasets. Our experiments show that domain-specific pretraining serves as a solid foundation for a wide range of biomedical NLP tasks, leading to new state-of-the-art results across the board. Further, in conducting a thorough evaluation of modeling choices, both for pretraining and task-specific fine-tuning, we discover that some common practices are unnecessary with BERT models, such as using complex tagging schemes in named entity recognition. To help accelerate research in biomedical NLP, we have released our state-of-the-art pretrained and task-specific models for the community, and created a leaderboard featuring our BLURB benchmark (short for Biomedical Language Understanding & Reasoning Benchmark) at https://aka.ms/BLURB .
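
The remark about tagging schemes in named entity recognition can be made concrete with a small sketch. It assumes the "complex" schemes are those like BIO, which label the first token of an entity separately from the rest, and that the simpler alternative is IO, which only marks whether a token is inside an entity. The abstract does not name the schemes, so this reading, the function name `bio_to_io`, the example sentence, and the `Gene` and `Disease` labels are all illustrative assumptions, not taken from the article.

```python
# Illustrative sketch only: collapses BIO tags into the simpler IO scheme.
# The helper name, the sentence, and the entity labels are hypothetical.

def bio_to_io(tags):
    """Map every B-X tag to I-X; leave I-X and O tags unchanged."""
    return ["I-" + tag[2:] if tag.startswith("B-") else tag for tag in tags]


tokens = ["Mutations", "in", "BRCA1", "cause", "breast", "cancer"]
bio_tags = ["O", "O", "B-Gene", "O", "B-Disease", "I-Disease"]
io_tags = bio_to_io(bio_tags)

# Print tokens with both labelings side by side for comparison.
for token, bio, io in zip(tokens, bio_tags, io_tags):
    print(f"{token:10s} {bio:10s} {io}")
```

One trade-off worth noting: IO cannot separate two adjacent entities of the same type, since it drops the boundary marker that BIO provides.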

Journal: ACM Transactions on Computing for Healthcare
Publication Date: Oct 15, 2021
Citations: 537

Similar Papers

Exploring the Latest Highlights in Medical Natural Language Processing across Multiple Languages: A Survey.
Anastassia Shaitarova ... Michael Krauthammer
Yearbook of Medical Informatics | VOL. 32
01 Aug 2023

StaResGRU-CNN with CMedLMs: A stacked residual GRU-CNN with pre-trained biomedical language models for predictive intelligence
Pin Ni ... Victor Chang
Applied Soft Computing | VOL. 113
13 Oct 2021

Benchmarking for biomedical natural language processing tasks with a domain specific ALBERT
Usman Naseem ... Adam G Dunn
BMC Bioinformatics | VOL. 23
21 Apr 2022

Critical assessment of transformer-based AI models for German clinical notes
Manuel Lentzen ... Martin Hofmann-Apitius
JAMIA Open | VOL. 5
04 Oct 2022