Artificial intelligence (such as ChatGPT) augments patient education on medical topics, including reconstructive surgery. Herein we assess the information, misinformation, and readability of ChatGPT responses to reconstructive urology questions. We also evaluate prompt engineering to optimize responses. 125 questions were presented to ChatGPT (version 4o, OpenAI) and were divided into 6 domains: stress urinary incontinence, neurogenic bladder, urethral stricture, ureteral stricture, impotence, and Peyronie's disease. Quality of health information was assessed using DISCERN (1 [low] to 5 [high]). Understandability and actionability were assessed using the Patient Education Materials Assessment Tool for Printable Materials (PEMAT-P; 0% [low] to 100% [high]). Misinformation was scored from 1 (no misinformation) to 5 (high misinformation). Grade level and reading ease were calculated using the Flesch-Kincaid grade level (5 [easy] to 16 [difficult]) and the Flesch reading-ease score (100-90 [5th-grade level] to 10-0 [professional level]), respectively. Mean and median DISCERN scores were 3.63 and 5. PEMAT-P understandability was 85.3%, but actionability was only 37.2%. There was little misinformation (mean 1.23, range 1-4). Responses were written at a college-graduate reading level. Using prompt engineering in the incontinence domain, scores for DISCERN (3.57 to 4.75, p=0.007), PEMAT-P understandability (89.6% to 96.2%, p<0.001), actionability (38.3% to 93.5%, p<0.001), and reading level (grade 12.4 to 5.4, p<0.001) all improved significantly, while misinformation and word count did not change significantly. ChatGPT-4o's responses are of high quality and understandability with little misinformation; limitations include low actionability and an advanced reading level. With prompt engineering, these deficiencies were addressed without increasing misinformation. ChatGPT-4o can help augment patient education in reconstructive urology.
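For reference, the readability metrics cited above are conventionally defined by the standard Flesch formulas; the abstract does not specify the exact implementation used, so the following is a sketch of the usual definitions:

\[ \text{Flesch Reading Ease} = 206.835 - 1.015\left(\frac{\text{total words}}{\text{total sentences}}\right) - 84.6\left(\frac{\text{total syllables}}{\text{total words}}\right) \]

\[ \text{Flesch-Kincaid Grade Level} = 0.39\left(\frac{\text{total words}}{\text{total sentences}}\right) + 11.8\left(\frac{\text{total syllables}}{\text{total words}}\right) - 15.59 \]

Under these definitions, shorter sentences and fewer syllables per word raise the reading-ease score and lower the grade level, which is consistent with the reported drop from grade 12.4 to 5.4 after prompt engineering.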
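The abstract does not state how the prompt-engineered queries were delivered (the study may have used the ChatGPT web interface directly). Purely as an illustrative sketch, assuming programmatic access via the OpenAI Python client, a query engineered for a lower reading level and more actionable output might look like the following; the question text and the readability instruction are hypothetical examples, not the study's prompts.

# Hypothetical sketch of a prompt-engineered patient-education query.
# Assumes the OpenAI Python client (openai>=1.x) and an OPENAI_API_KEY
# set in the environment; not the study's actual method.
from openai import OpenAI

client = OpenAI()

question = "What are my treatment options for a urethral stricture?"

# The engineered prompt targets the two deficiencies noted in the abstract:
# advanced reading level and low actionability.
engineered_prompt = (
    "Answer the following patient question at about a 6th-grade reading level, "
    "using short sentences, and end with clear, numbered next steps the patient "
    "can take:\n\n" + question
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": engineered_prompt}],
)

print(response.choices[0].message.content)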