A comparative study of large language model-based zero-shot inference and task-specific supervised classification of breast cancer pathology reports.

Madhumita Sushil, Travis Zack, Divneet Mandair, Zhiwei Zheng, Ahmed Wali, Yan-Ning Yu, Yuwei Quan, Dmytro Lituiev, Atul J Butte

doi:10.1093/jamia/ocae146

Open Access

Abstract

Although supervised machine learning is popular for information extraction from clinical notes, creating large annotated datasets requires extensive domain expertise and is time-consuming. Meanwhile, large language models (LLMs) have demonstrated promising transfer learning capability. In this study, we explored whether recent LLMs could reduce the need for large-scale data annotations. We curated a dataset of 769 breast cancer pathology reports, manually labeled with 12 categories, to compare zero-shot classification capability of the following LLMs: GPT-4, GPT-3.5, Starling, and ClinicalCamel, with task-specific supervised classification performance of 3 models: random forests, long short-term memory networks with attention (LSTM-Att), and the UCSF-BERT model. Across all 12 tasks, the GPT-4 model performed either significantly better than or as well as the best supervised model, LSTM-Att (average macro F1-score of 0.86 vs 0.75), with advantage on tasks with high label imbalance. Other LLMs demonstrated poor performance. Frequent GPT-4 error categories included incorrect inferences from multiple samples and from history, and complex task design, and several LSTM-Att errors were related to poor generalization to the test set. On tasks where large annotated datasets cannot be easily collected, LLMs can reduce the burden of data labeling. However, if the use of LLMs is prohibitive, the use of simpler models with large annotated datasets can provide comparable results. GPT-4 demonstrated the potential to speed up the execution of clinical NLP studies by reducing the need for large annotated datasets. This may increase the utilization of NLP-based variables and outcomes in clinical studies.
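
The abstract compares models by average macro F1-score and notes an advantage on tasks with high label imbalance. As a minimal sketch of why macro F1 is sensitive to imbalance (this is not the authors' evaluation code; the function name `macro_f1` and the toy labels are hypothetical), the metric can be computed as the unweighted mean of per-class F1 scores:

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Return the unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); the denominator is nonzero
        # because every label occurs in y_true or y_pred at least once.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy, made-up labels: 8 "positive" and 2 "negative" reports,
# with a classifier that always predicts the majority class.
y_true = ["positive"] * 8 + ["negative"] * 2
y_pred = ["positive"] * 10
print(round(macro_f1(y_true, y_pred), 3))
```

In this toy case accuracy is 0.8, but the minority class contributes an F1 of 0, so the macro average falls to about 0.444. Because every class counts equally, a model that handles rare labels well is rewarded under this metric.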

Journal: Journal of the American Medical Informatics Association : JAMIA
Publication Date: Jun 20, 2024
Citations: 2
License type: CC BY-NC-ND 4.0
