Current Strengths and Weaknesses of ChatGPT as a Resource for Radiation Oncology Patients and Providers

Warren Floyd, Troy Kleber, David J Carpenter, Melisa Pasli, Jamiluddin Qazi, Christina Huang, Jim Leng, Bradley G Ackerson, Matthew Pierpoint, Joseph K Salama, Matthew J Boyer

doi:10.1016/j.ijrobp.2023.10.020

Abstract

Purpose: ChatGPT, an artificial intelligence (AI) program that uses natural language processing to generate conversational-style responses to questions or inputs, is increasingly being used by both patients and healthcare professionals. This study aims to evaluate the accuracy and comprehensiveness of ChatGPT in radiation oncology-related domains, including answering common patient questions, summarizing landmark clinical research studies, and providing literature reviews with specific references supporting current standard-of-care clinical practice in radiation oncology.

Methods: We assessed the performance of ChatGPT version 3.5 (ChatGPT3.5) in three areas. First, we evaluated ChatGPT3.5's ability to answer 28 templated patient-centered questions applied across 9 cancer types. We then tested ChatGPT3.5's ability to summarize specific portions of 10 landmark studies in radiation oncology. Finally, we used ChatGPT3.5 to identify scientific studies supporting current standard-of-care practice in clinical radiation oncology for 5 different cancer types. Each response was graded independently by two reviewers, with discordant grades resolved by a third reviewer.

Results: ChatGPT3.5 frequently generated inaccurate or incomplete responses. Only 39.7% of responses to patient-centered questions were considered correct and comprehensive. When summarizing landmark studies in radiation oncology, 35.0% of ChatGPT3.5's responses were accurate and comprehensive, improving to 43.3% when ChatGPT3.5 was provided the full text of the study. ChatGPT3.5's ability to present a list of studies related to standard-of-care clinical practices was also unsatisfactory, with 50.6% of the provided studies fabricated.

Conclusion: ChatGPT should not be considered a reliable radiation oncology resource for patients or providers at this time, as it frequently generates inaccurate or incomplete responses. However, natural language processing-based AI programs are rapidly evolving, and future versions of ChatGPT or similar programs may demonstrate improved performance in this domain.
