Abstract

Background and aims
We aimed to evaluate the precision, medical accuracy, superfluous content, and consistency of ChatGPT's responses to commonly asked questions about endoscopic procedures, as well as its capability to provide emotional support, comparing its performance with the Generative Pre-trained Transformer 4 (GPT-4) model.

Methods
A set of 113 questions related to esophagogastroduodenoscopy (EGD), colonoscopy, endoscopic ultrasound (EUS), and endoscopic retrograde cholangiopancreatography (ERCP) was curated from professional societies and institutional web pages. Responses from ChatGPT were generated and subsequently graded by board-certified gastroenterologists and advanced endoscopists. The emotional support efficacy of ChatGPT and GPT-4 was also assessed by a board-certified psychiatrist (LSM).

Results
ChatGPT exhibited moderate precision in answering questions about EGD (57.9% comprehensive), colonoscopy (47.6% comprehensive), EUS (48.1% comprehensive), and ERCP (44.4% comprehensive). Medical accuracy was highest for EGD (52.6% fully accurate) and lowest for EUS (40.7% fully accurate). Regarding superfluous content, responses were predominantly concise for EGD and colonoscopy, whereas ERCP and EUS responses contained more extraneous content. Reproducibility scores varied across domains, ranging from 50.34% (EUS) to 68.6% (EGD). GPT-4 outperformed ChatGPT in providing emotional support, though both models performed satisfactorily.

Conclusion
ChatGPT delivers moderately precise and medically accurate answers to common questions about endoscopic procedures, with varying levels of extraneous content. It holds promise as a supplementary information resource for both patients and healthcare professionals.
