Patient-directed Electronic Health Record (EHR) messaging is used as an adjunct to enhance patient-physician interactions, but it adds to the physician's workload. Clear electronic patient communication is needed across all areas of medicine, including plastic surgery, and innovative communication tools such as ChatGPT could potentially help meet that need. This study assesses ChatGPT's effectiveness in answering breast reconstruction queries, comparing the accuracy, empathy, and readability of its responses with those of healthcare providers. Ten deidentified questions regarding breast reconstruction were extracted from electronic messages and presented to ChatGPT3, ChatGPT4, plastic surgeons, and advanced practice providers for response; ChatGPT3 and ChatGPT4 were also prompted to give brief responses. Accuracy was graded by 2 plastic surgeons and empathy by medical students, each on a 1-5 Likert scale. Readability was measured using Flesch Reading Ease, and scores were compared using 2-tailed t tests. Combined provider responses had higher Flesch Reading Ease scores (ie, were easier to read) than all combined chatbot responses (53.3 ± 13.3 vs 36.0 ± 11.6, P < 0.001) and combined brief chatbot responses (53.3 ± 13.3 vs 34.7 ± 12.8, P < 0.001). Empathy scores were higher for all combined chatbot responses than for combined provider responses (2.9 ± 0.8 vs 2.0 ± 0.9, P < 0.001). Accuracy did not differ significantly between combined providers and all combined chatbot responses (4.3 ± 0.9 vs 4.5 ± 0.6, P = 0.170) or combined brief chatbot responses (4.3 ± 0.9 vs 4.6 ± 0.6, P = 0.128). Amid the time constraints and complexities of plastic surgery decision making, our study underscores ChatGPT's potential to enhance patient communication: ChatGPT excels in empathy and accuracy, but its limited readability should be addressed.
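The comparison pipeline described above (Flesch Reading Ease scoring of each response group followed by two-tailed t tests) can be illustrated with a minimal sketch. The use of the textstat and scipy libraries and the placeholder responses below are assumptions for illustration only; they are not the study's actual code or data.

```python
# Hypothetical sketch: Flesch Reading Ease scoring of two response groups,
# then a two-tailed independent-samples t test comparing the groups.
# The example responses are placeholders, not data from the study.
import textstat            # third-party: pip install textstat
from scipy import stats    # third-party: pip install scipy

provider_responses = [
    "Most patients go home the day after surgery. Call us if you notice redness.",
    "Your drains usually come out within one to two weeks at your follow-up visit.",
]
chatbot_responses = [
    "Postoperative recovery following autologous breast reconstruction typically "
    "necessitates inpatient observation to monitor flap perfusion and analgesia.",
    "Drain removal is contingent upon serosanguineous output falling below a "
    "threshold determined by your reconstructive surgeon at follow-up evaluation.",
]

# Flesch Reading Ease: higher scores indicate easier-to-read text.
provider_scores = [textstat.flesch_reading_ease(r) for r in provider_responses]
chatbot_scores = [textstat.flesch_reading_ease(r) for r in chatbot_responses]

# Two-tailed independent-samples t test comparing the two groups of scores.
t_stat, p_value = stats.ttest_ind(provider_scores, chatbot_scores)
print(f"provider mean = {sum(provider_scores) / len(provider_scores):.1f}")
print(f"chatbot mean  = {sum(chatbot_scores) / len(chatbot_scores):.1f}")
print(f"t = {t_stat:.2f}, two-tailed p = {p_value:.3f}")
```

The same t-test comparison would apply to the Likert-scored accuracy and empathy ratings, with the readability function replaced by the graders' scores.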