Abstract

Biomedical research papers often combine disjoint concepts in novel ways, such as when describing a newly discovered relationship between an understudied gene and an important disease. These concepts are often explicitly encoded as metadata keywords, such as the author-provided terms included with many documents in the MEDLINE database. While substantial recent work has addressed the problem of text generation in a more general context, applications such as scientific writing assistants or hypothesis generation systems could benefit from the capacity to select the specific set of concepts that underpin a generated biomedical text. We propose a conditional language model based on the transformer architecture. This model uses its encoder stack to encode the concepts that a user wishes to discuss in the generated text. The decoder stack then follows the masked self-attention pattern to perform text generation, using both the prior tokens and the encoded condition. We demonstrate that this approach provides significant control while still producing reasonable biomedical text.
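
To make the described architecture concrete, the sketch below shows one way such a conditioned encoder-decoder could be wired up in PyTorch: a shallow encoder embeds user-selected keywords into a condition, and a masked decoder attends to both the prior tokens and that condition. The module names, layer counts, and keyword-embedding scheme are illustrative assumptions, not the authors' implementation; positional encodings are omitted for brevity.

    # Minimal sketch of the conditioned encoder-decoder, assuming PyTorch.
    # All sizes and names are illustrative; positional encodings are omitted.
    import torch
    import torch.nn as nn

    class ConditionalAbstractLM(nn.Module):
        def __init__(self, vocab_size, keyword_vocab_size, d_model=512,
                     n_enc_layers=2, n_dec_layers=8, n_heads=8):
            super().__init__()
            self.token_emb = nn.Embedding(vocab_size, d_model)
            self.keyword_emb = nn.Embedding(keyword_vocab_size, d_model)
            # Shallow encoder stack: encodes the user-selected concepts.
            self.encoder = nn.TransformerEncoder(
                nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True),
                n_enc_layers)
            # Deep decoder stack: masked self-attention over prior tokens,
            # plus cross-attention to the encoded condition.
            self.decoder = nn.TransformerDecoder(
                nn.TransformerDecoderLayer(d_model, n_heads, batch_first=True),
                n_dec_layers)
            self.lm_head = nn.Linear(d_model, vocab_size)

        def forward(self, keyword_ids, token_ids):
            # keyword_ids: (batch, n_keywords); token_ids: (batch, seq_len)
            condition = self.encoder(self.keyword_emb(keyword_ids))
            seq_len = token_ids.size(1)
            # Additive causal mask: -inf above the diagonal blocks future tokens.
            causal_mask = torch.triu(
                torch.full((seq_len, seq_len), float("-inf")), diagonal=1)
            hidden = self.decoder(self.token_emb(token_ids), condition,
                                  tgt_mask=causal_mask)
            return self.lm_head(hidden)  # next-token logits

At generation time, the same keyword condition would be held fixed while tokens are sampled autoregressively from these logits.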

Highlights

  • Scientific papers often combine a range of disconnected concepts in novel patterns, following the typical research strategies of many scientists [1]

  • In the Multi-Conditional Language Model section, we describe the methodology behind the Conditional Biomedical Abstract Generation (CBAG) model, which specializes the transformer architecture for generating biomedical abstracts

  • While Natural Language Processing (NLP) benchmarks such as GLUE [24] and its biomedical counterpart BLUE [22] help researchers compare performance across a range of tasks, we are unaware of a benchmark for the generation of biomedical abstracts

Summary

Introduction

Scientific papers often combine a range of disconnected concepts in novel patterns, following the typical research strategies of many scientists [1]. The CBAG model is a transformer featuring a shallow encoder stack to encode qualities of the condition and a deep decoder stack to produce a high-quality language model. We train this model using semi-supervised multi-task generative pre-training, wherein, to minimize our proposed objective function, the model must predict successive tokens, parts of speech, dependency tags, and entity labels. Trained using MEDLINE records and informed by semi-supervised domain-specific annotations, this model captures biomedical jargon, entities, and patterns of scientific discussion. We compare this model to two instances of GPT-2, both the original and a fine-tuned version, and find competitive quantitative results. We discuss these concerns further in the Future Challenges and Ethical Considerations section.
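
A hedged sketch of how such a multi-task objective could be implemented is shown below: the decoder's hidden states feed separate prediction heads for next tokens, parts of speech, dependency tags, and entity labels, and the per-task cross-entropy losses are summed. The head names and the equal task weighting are assumptions for illustration, not details taken from the paper.

    # Sketch of a summed multi-task loss over the decoder's hidden states.
    # Head names and equal weighting are assumptions.
    import torch.nn as nn
    import torch.nn.functional as F

    class MultiTaskHeads(nn.Module):
        def __init__(self, d_model, vocab_size, n_pos, n_dep, n_ent):
            super().__init__()
            self.token_head = nn.Linear(d_model, vocab_size)  # next token
            self.pos_head = nn.Linear(d_model, n_pos)         # part of speech
            self.dep_head = nn.Linear(d_model, n_dep)         # dependency tag
            self.ent_head = nn.Linear(d_model, n_ent)         # entity label

        def loss(self, hidden, token_tgt, pos_tgt, dep_tgt, ent_tgt):
            # hidden: (batch, seq_len, d_model); targets: (batch, seq_len)
            def ce(logits, target):
                # cross_entropy expects (batch, classes, seq_len)
                return F.cross_entropy(logits.transpose(1, 2), target)
            return (ce(self.token_head(hidden), token_tgt)
                    + ce(self.pos_head(hidden), pos_tgt)
                    + ce(self.dep_head(hidden), dep_tgt)
                    + ce(self.ent_head(hidden), ent_tgt))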

