Performance of two large language models for data extraction in evidence synthesis.

Amanda Konet, Ian Thomas, Gerald Gartlehner, Leila Kahwati, Rainer Hilscher, Shannon Kugley, Karen Crotty, Meera Viswanathan, Robert Chew

doi:10.1002/jrsm.1732

Abstract

Accurate data extraction is a key component of evidence synthesis and critical to valid results. The advent of publicly available large language models (LLMs) has generated interest in these tools for evidence synthesis and created uncertainty about the choice of LLM. We compare the performance of two widely available LLMs (Claude 2 and GPT-4) for extracting prespecified data elements from 10 published articles included in a previously completed systematic review. We use prompts and full study PDFs to compare the outputs from the browser versions of Claude 2 and GPT-4. GPT-4 required use of a third-party plugin to upload and parse PDFs. Accuracy was high for Claude 2 (96.3%). The accuracy of GPT-4 with the plugin was lower (68.8%); however, most of the errors were due to the plugin. Both LLMs correctly recognized when prespecified data elements were missing from the source PDF and generated correct information for data elements that were not reported explicitly in the articles. A secondary analysis demonstrated that, when provided selected text from the PDFs, Claude 2 and GPT-4 accurately extracted 98.7% and 100% of the data elements, respectively. Limitations include the narrow scope of the study PDFs used, that prompt development was completed using only Claude 2, and that we cannot guarantee the open-source articles were not used to train the LLMs. This study highlights the potential for LLMs to revolutionize data extraction but underscores the importance of accurate PDF parsing. For now, it remains essential for a human investigator to validate LLM extractions.
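The accuracy figures above are proportions of data elements judged correct. As an illustration only, the following Python sketch shows one way such a proportion could be computed against a human reference extraction. It is not the authors' procedure: the abstract does not describe how matches were adjudicated, so the exact-match rule after normalization, the function names (`normalize`, `extraction_accuracy`), and the toy studies and values are all assumptions made for this example.

```python
from typing import Dict, Optional, Tuple

# Key: (article_id, data_element_name).
# Value: the extracted text, or None when the element is not reported.
Extraction = Dict[Tuple[str, str], Optional[str]]


def normalize(value: Optional[str]) -> Optional[str]:
    """Lowercase and collapse whitespace so trivial formatting
    differences are not counted as errors (an assumed matching rule)."""
    if value is None:
        return None
    return " ".join(value.lower().split())


def extraction_accuracy(extracted: Extraction, reference: Extraction) -> float:
    """Fraction of reference data elements whose extracted value matches."""
    if not reference:
        raise ValueError("reference must contain at least one data element")
    correct = 0
    for key, ref_value in reference.items():
        # An element absent from the LLM output counts as an error.
        # None == None counts as correct: the model recognized that
        # the element was missing from the article.
        if key in extracted and normalize(extracted[key]) == normalize(ref_value):
            correct += 1
    return correct / len(reference)


# Hypothetical toy data, not taken from the study.
reference = {
    ("study_a", "sample_size"): "120",
    ("study_a", "country"): "Austria",
    ("study_b", "sample_size"): "58",
    ("study_b", "funding_source"): None,  # not reported in the article
}
extracted = {
    ("study_a", "sample_size"): "120",
    ("study_a", "country"): " austria ",   # formatting difference only
    ("study_b", "sample_size"): "85",      # transposed digits: an error
    ("study_b", "funding_source"): None,   # correctly flagged as missing
}

print(f"{extraction_accuracy(extracted, reference):.1%}")  # prints 75.0%
```

Exact matching is a deliberately strict stand-in. In practice a human reviewer would likely judge elements that are paraphrased or inferred rather than reported explicitly.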

Journal: Research synthesis methods
Publication Date: Jun 19, 2024
Citations: 1

Similar Papers

Data extraction for evidence synthesis using a large language model: A proof-of-concept study.
Gerald Gartlehner ... Shannon Kugley
Research synthesis methods | VOL. 15
03 Mar 2024

How Can IJDS Authors, Reviewers, and Editors Use (and Misuse) Generative AI?
Galit Shmueli ... Bianca Maria Colosimo
INFORMS Journal on Data Science | VOL. 2
01 Apr 2023

Extracting accurate materials data from research papers with conversational language models and prompt engineering.
Maciej P Polak ... Dane Morgan
Nature Communications | VOL. 15
21 Feb 2024

Performance of Large Language Models on a Neurology Board–Style Examination
Marc Cicero Schubert ... Varun Venkataramani
JAMA network open | VOL. 6
07 Dec 2023
