Programming Chatbots Using Natural Language: Generating Cervical Spine MRI Impressions.

Ramin Javan, Theodore Kim, Ahmed Abdelmonem, Ahmed Ismail, Farris Jaamour, Oleksiy Melnyk, Mary Heekin

doi:10.7759/cureus.69410

Abstract

The utility of machine learning, specifically large language models (LLMs), in the medical field has gained considerable attention. However, few studies focus on the application of LLMs to generating custom subspecialty radiology impressions. The primary objective of this study is to evaluate and compare the performance of multiple LLMs in generating specialized, accurate, and clinically useful radiology impressions for degenerative cervical spine MRI reports. The study employed a comparative analysis of multiple LLMs, including OpenAI's ChatGPT-3.5 and GPT-4 (OpenAI, San Francisco, CA), Anthropic's Claude 2 (Anthropic PBC, San Francisco, CA), Google's Bard (Google Inc., Mountain View, CA), and Meta's Llama 2 (Meta Platforms, Inc., Menlo Park, CA). The analysis was performed during January-February 2024. These models were evaluated using a few-shot learning approach on a dataset consisting of 10 examples from 50 synthetically generated MRI reports. The performance metrics evaluated were diagnostic accuracy, stylistic accuracy, and redundancy. While Claude 2 maintained consistently high performance across 40 cases, GPT-4 required midway re-training to improve its declining scores. Both Claude 2 and GPT-4 demonstrated the ability to generate structured impressions, but Claude 2's specialized summarization capabilities provided an edge in maintaining accuracy without continuous feedback. The other LLMs' performance was subpar. The findings of this study suggest that LLMs can be a valuable tool in automating the generation of radiology impressions. Claude 2, in particular, exhibited promising results, indicating its potential for clinical implementation. However, the study also points to the necessity for further research, especially in optimizing model performance and evaluating real-world applicability.
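As a rough illustration of the few-shot setup described in the abstract, the sketch below assembles a plain-text prompt from worked examples followed by a new report. It is not the authors' prompt or code: the `Report` class, the function names, the prompt wording, and the 10/40 split (inferred from "10 examples from 50" and "across 40 cases") are all assumptions, and the report text is placeholder data only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    """Hypothetical container for one synthetic MRI report."""
    findings: str    # body of the report
    impression: str  # reference impression written for that report


def split_reports(reports, n_examples=10):
    """Split reports into few-shot examples and held-out evaluation cases."""
    return reports[:n_examples], reports[n_examples:]


def build_few_shot_prompt(examples, new_findings):
    """Assemble a prompt: an instruction, the worked examples, then the new case."""
    parts = [
        "Write the impression for a degenerative cervical spine MRI report, "
        "matching the style of the examples."
    ]
    for i, ex in enumerate(examples, start=1):
        parts.append(
            f"Example {i}\nFindings: {ex.findings}\nImpression: {ex.impression}"
        )
    # The prompt ends with an open "Impression:" slot for the model to complete.
    parts.append(f"New report\nFindings: {new_findings}\nImpression:")
    return "\n\n".join(parts)


# Placeholder data only; no real or study report text is reproduced here.
reports = [Report(f"<findings {i}>", f"<impression {i}>") for i in range(1, 51)]
examples, eval_cases = split_reports(reports)
prompt = build_few_shot_prompt(examples, eval_cases[0].findings)
```

The resulting `prompt` string could then be pasted into or sent to any of the chatbots compared in the study; how the authors actually delivered examples and feedback to each model is not specified in the abstract.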

More From: Cureus
