Accuracy, readability, and understandability of large language models for prostate cancer information to the public.

Jacob S Hershenhouse, Michael B Eppler, Severin Rodler, Daniel Mokhtar, Conner Ganjavi, Brian Hom, Lorenzo Storino Ramacciotti, Inderbir Gill, Andrea Cocci, John Tran, Giorgio Ivan Russo, Andre Abreu, Giovanni E Cacciamani, Mihir Desai, Ryan J Davis

doi:10.1038/s41391-024-00826-y


Open Access

https://doi.org/10.1038/s41391-024-00826-y


Journal: Prostate Cancer and Prostatic Diseases
Publication Date: May 14, 2024
Citations: 4
License type: CC BY 4.0

Abstract

Generative Pre-trained Transformer (GPT) chatbots have gained popularity since the public release of ChatGPT. Studies have evaluated the ability of different GPT models to provide information about medical conditions. To date, no study has assessed the quality of ChatGPT outputs to prostate cancer-related questions from both the physician and public perspectives while optimizing outputs for patient consumption. Nine prostate cancer-related questions, identified through Google Trends (Global), were categorized into diagnosis, treatment, and postoperative follow-up. These questions were processed using ChatGPT 3.5, and the responses were recorded. Subsequently, these responses were re-inputted into ChatGPT to create simplified summaries understandable at a sixth-grade level. Readability of both the original ChatGPT responses and the layperson summaries was evaluated using validated readability tools. A survey was conducted among urology providers (urologists and urologists in training) to rate the original ChatGPT responses for accuracy, completeness, and clarity using a 5-point Likert scale. Furthermore, two independent reviewers evaluated the layperson summaries on a correctness trifecta: accuracy, completeness, and decision-making sufficiency. Public assessment of the simplified summaries' clarity and understandability was carried out through Amazon Mechanical Turk (MTurk). Participants rated the clarity and demonstrated their understanding through a multiple-choice question. GPT-generated output was deemed correct by 71.7% to 94.3% of raters (36 urologists, 17 urology residents) across 9 scenarios. GPT-generated simplified layperson summaries of this output were rated as accurate in 8 of 9 (88.9%) scenarios and sufficient for a patient to make a decision in 8 of 9 (88.9%) scenarios. Mean readability of layperson summaries was higher than that of the original GPT outputs ([original ChatGPT v. simplified ChatGPT, mean (SD), p-value] Flesch Reading Ease: 36.5 (9.1) v. 70.2 (11.2), p < 0.0001; Gunning Fog: 15.8 (1.7) v. 9.5 (2.0), p < 0.0001; Flesch Grade Level: 12.8 (1.2) v. 7.4 (1.7), p < 0.0001; Coleman Liau: 13.7 (2.1) v. 8.6 (2.4), p = 0.0002; SMOG index: 11.8 (1.2) v. 6.7 (1.8), p < 0.0001; Automated Readability Index: 13.1 (1.4) v. 7.5 (2.1), p < 0.0001). MTurk workers (n = 514) rated the layperson summaries as correct (89.5-95.7%) and correctly understood the content (63.0-87.4%). GPT shows promise for correct patient education for prostate cancer-related content, but the technology is not designed for delivering information to patients. Prompting the model to respond with accuracy, completeness, clarity and readability may enhance its utility when used for GPT-powered medical chatbots.
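
The abstract does not name the readability tools that were used, so the following is only a minimal sketch of how two of the reported scores are conventionally computed. It assumes that "Flesch Grade Level" refers to the standard Flesch-Kincaid Grade Level formula, and it uses a rough vowel-group heuristic for syllables (the helper names `count_syllables` and `text_counts` are hypothetical, and validated tools count syllables more carefully). Higher Flesch Reading Ease means easier text, while a lower grade level means easier text.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate: count vowel groups, drop one trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and count > 1:
        count -= 1
    return max(1, count)


def text_counts(text: str) -> tuple[int, int, int]:
    """Return (words, sentences, syllables) using simple regex tokenization."""
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not words:
        raise ValueError("text contains no words")
    # Treat each run of ., ! or ? as one sentence boundary (an approximation).
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    syllables = sum(count_syllables(w) for w in words)
    return len(words), sentences, syllables


def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch Reading Ease formula."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch-Kincaid Grade Level formula (approximate US school grade)."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


if __name__ == "__main__":
    # Illustrative sentence only; not taken from the study's materials.
    sample = "The doctor checks the prostate. A blood test can help find cancer early."
    w, s, syl = text_counts(sample)
    print(f"words={w}, sentences={s}, syllables={syl}")
    print(f"Flesch Reading Ease: {flesch_reading_ease(w, s, syl):.1f}")
    print(f"Flesch-Kincaid Grade: {flesch_kincaid_grade(w, s, syl):.1f}")
```

Both formulas depend only on average sentence length and average syllables per word, which is why shortening sentences and swapping in shorter words, as the simplified summaries did, moves both scores toward the easier end.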

