Improving medical reasoning through retrieval and self-reflection with retrieval-augmented large language models.

Minbyul Jeong, Jiwoong Sohn, Mujeen Sung, Jaewoo Kang

doi:10.1093/bioinformatics/btae238


Open Access


Journal: Bioinformatics (Oxford, England)
Publication Date: Jun 28, 2024
Citations: 7
License type: CC BY 4.0


Abstract


Recent proprietary large language models (LLMs), such as GPT-4, have reached a milestone in tackling diverse challenges in the biomedical domain, ranging from multiple-choice questions to long-form generation. To address challenges that the encoded knowledge of LLMs still cannot handle, various retrieval-augmented generation (RAG) methods have been developed; they search a knowledge corpus for documents and append them, unconditionally or selectively, to the LLM input for generation. However, existing methods generalize poorly when applied to different domain-specific problems, which leads them to fetch incorrect documents or make inaccurate judgments. In this paper, we introduce Self-BioRAG, a reliable framework for biomedical text that specializes in generating explanations, retrieving domain-specific documents, and self-reflecting on generated responses. We use 84k filtered biomedical instruction sets to train Self-BioRAG so that it can assess its generated explanations with customized reflective tokens. Our work shows that domain-specific components, such as a retriever, a domain-related document corpus, and instruction sets, are necessary for adhering to domain-related instructions. On three major medical question-answering benchmark datasets, Self-BioRAG shows significant performance gains, achieving a 7.2% absolute improvement on average over the state-of-the-art open-foundation model with a parameter size of 7B or less. Similarly, Self-BioRAG outperforms RAG by 8% in Rouge-1 score on average across two long-form question-answering benchmarks, generating more proficient answers. Overall, our analysis indicates that Self-BioRAG finds the clues in the question, retrieves relevant documents if needed, and understands how to answer with information from retrieved documents and encoded knowledge, as a medical expert does.
We release our data and code for training our framework components and model weights (7B and 13B) to enhance capabilities in biomedical and clinical domains. Self-BioRAG is available at https://github.com/dmis-lab/self-biorag.
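
The control flow described in the abstract (decide whether to retrieve, generate with or without evidence, then self-assess the candidates) can be illustrated with a toy sketch. This is not the Self-BioRAG implementation: the corpus, the function names, and the word-overlap scoring are all hypothetical stand-ins for the trained retriever, generator, and reflective tokens, chosen only so the example runs without any model.

```python
import re

# Hypothetical toy corpus standing in for a biomedical document collection.
CORPUS = [
    "Metformin is a first-line oral medication for type 2 diabetes.",
    "Warfarin is an anticoagulant monitored with the INR blood test.",
    "Amoxicillin is a penicillin-class antibiotic.",
]

def tokens(text):
    """Lowercased word set used by the toy scoring functions."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def overlap(a, b):
    """Number of distinct words shared by two strings."""
    return len(tokens(a) & tokens(b))

def decide_retrieval(question):
    """Assumed stand-in for a learned retrieve/no-retrieve decision:
    retrieve only if some document shares at least two words with the question."""
    return max(overlap(question, doc) for doc in CORPUS) >= 2

def retrieve(question, top_k):
    """Assumed stand-in for a domain-specific retriever: rank by word overlap."""
    return sorted(CORPUS, key=lambda doc: -overlap(question, doc))[:top_k]

def generate(question, evidence):
    """Assumed stand-in for the language model's generation step."""
    if evidence is None:
        return "Answer from encoded knowledge only."
    return "Answer grounded in: " + evidence

def critique(question, evidence):
    """Assumed stand-in for reflective-token scoring of a candidate."""
    return overlap(question, evidence)

def answer(question, top_k=2):
    # Step 1: decide whether retrieval is needed at all.
    if not decide_retrieval(question):
        return generate(question, None)
    # Step 2: draft one candidate answer per retrieved document.
    candidates = [(critique(question, doc), generate(question, doc))
                  for doc in retrieve(question, top_k)]
    # Step 3: keep the candidate that the critique step scores highest.
    return max(candidates, key=lambda pair: pair[0])[1]

print(answer("What test monitors warfarin?"))
print(answer("Hello there"))
```

In this sketch the first question triggers retrieval and is answered from the warfarin document, while the second shares no words with the corpus and falls back to the no-retrieval path. In the actual framework these decisions come from a trained model rather than word overlap.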


Similar Papers

A comprehensive evaluation of large Language models on benchmark biomedical text processing tasks
Israt Jahan ... Jimmy Xiangji Huang
Computers in biology and medicine | VOL. 171
20 Feb 2024

Large language foundation models encode clinical radiation oncology domain knowledge: Performance on the American College of Radiology Standardized Examination.
Arturo Loaiza-Bonilla ... Cataldo Doria
Journal of Clinical Oncology | VOL. 42
01 Jun 2024

CARDBiomedBench: A Benchmark for Evaluating Large Language Model Performance in Biomedical Research: A novel question-and-answer benchmark designed to assess Large Language Models' comprehension of biomedical research, piloted on Neurodegenerative Diseases.
Owen Bianchi ... Faraz Faghri
bioRxiv : the preprint server for biology | VOL. -
21 Jan 2025

From pre-training to fine-tuning: An in-depth analysis of Large Language Models in the biomedical domain
Agnese Bonfigli ... Felice Dell’Orletta
Artificial Intelligence In Medicine | VOL. 157
23 Oct 2024
