Structuring medication signeturs as a language regression task: comparison of zero- and few-shot GPT with fine-tuned models.

Augusto Garcia-Agundez, Jinoos Yazdany, Jing Li, Gabriela Schmajuk, Milena Gianfrancesco, Julia L Kay, Angela Hu, Baljeet Rai

doi:10.1093/jamiaopen/ooae051

Abstract

Electronic health record textual sources such as medication signeturs (sigs) contain valuable information that is not always available in structured form. This information is commonly extracted through manual annotation, a repetitive and time-consuming task that could be fully automated using large language models (LLMs). While most sigs include simple instructions, some include complex patterns. We aimed to compare the performance of GPT-3.5 and GPT-4 with smaller fine-tuned models (ClinicalBERT, BlueBERT) in extracting the average daily dose of 2 immunomodulating medications with frequent complex sigs: hydroxychloroquine and prednisone. Using manually annotated sigs as the gold standard, we compared the performance of these models on 702 hydroxychloroquine and 22,104 prednisone prescriptions. GPT-4 vastly outperformed all other models for this task at any level of in-context learning. With 100 in-context examples, the model correctly annotated 94% of hydroxychloroquine and 95% of prednisone sigs to within 1 significant digit. Error analysis conducted by 2 additional manual annotators on annotator-model disagreements suggests that the vast majority of disagreements are model errors. Many model errors relate to ambiguous sigs on which there was also frequent annotator disagreement. Paired with minimal manual annotation, GPT-4 achieved excellent performance for language regression of complex medication sigs and vastly outperformed GPT-3.5, ClinicalBERT, and BlueBERT. However, the number of in-context examples GPT-4 needed to reach maximum performance was similar to that needed by GPT-3.5. LLMs show great potential to rapidly extract structured data from sigs in no-code fashion for clinical and research applications.
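To make the task concrete, the following is a minimal Python sketch of how few-shot prompting and the "within 1 significant digit" scoring could be set up. It is not the authors' implementation. The prompt wording, the function names (`build_prompt`, `parse_dose`, `round_sig`, `agrees`, `accuracy`), and the example sigs are all hypothetical, and the scoring rule assumes that a prediction counts as correct when it equals the gold value after both are rounded to one significant digit, which is only one possible reading of the abstract. No model is called; the sketch covers only the steps before and after the LLM request.

```python
import math
import re
from typing import Optional, Sequence, Tuple

Example = Tuple[str, float]  # (sig text, annotated average daily dose in mg)


def build_prompt(examples: Sequence[Example], sig: str, drug: str) -> str:
    """Assemble a few-shot prompt; with no examples this is a zero-shot prompt.

    The instruction wording is an illustrative assumption.
    """
    lines = [f"Extract the average daily dose in mg of {drug} from the sig."]
    for text, dose in examples:
        lines.append(f"Sig: {text}")
        lines.append(f"Average daily dose (mg): {dose:g}")
    lines.append(f"Sig: {sig}")
    lines.append("Average daily dose (mg):")
    return "\n".join(lines)


def parse_dose(reply: str) -> Optional[float]:
    """Return the first number in a model reply, or None if there is none."""
    match = re.search(r"\d+(?:\.\d+)?", reply)
    return float(match.group()) if match else None


def round_sig(x: float, digits: int = 1) -> float:
    """Round x to the given number of significant digits."""
    if x == 0:
        return 0.0
    exponent = math.floor(math.log10(abs(x)))
    return round(x, digits - 1 - exponent)


def agrees(pred: Optional[float], gold: float, digits: int = 1) -> bool:
    """Assumed scoring rule: equal after rounding to `digits` significant digits."""
    if pred is None:
        return False
    return math.isclose(round_sig(pred, digits), round_sig(gold, digits))


def accuracy(preds: Sequence[Optional[float]], golds: Sequence[float]) -> float:
    """Fraction of predictions that agree with the gold annotations."""
    if len(preds) != len(golds) or not golds:
        raise ValueError("preds and golds must be non-empty and the same length")
    hits = sum(agrees(p, g) for p, g in zip(preds, golds))
    return hits / len(golds)


if __name__ == "__main__":
    # Invented sigs for illustration only; they are not taken from the study data.
    shots = [("Take 1 tablet (200 mg) by mouth twice daily", 400.0)]
    print(build_prompt(shots, "Take 200 mg by mouth once daily", "hydroxychloroquine"))
    print(accuracy([parse_dose("410 mg"), parse_dose("unclear")], [400.0, 200.0]))
```

Treating the output as a single number per sig is what makes this a regression task rather than a classification task: the model reply is parsed to a float and compared numerically, so a taper or alternating-day schedule is scored on its average daily dose rather than on matching a fixed label.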


Journal: JAMIA Open
Publication Date: Apr 8, 2024
Citations: 1
License type: CC BY 4.0


