Abstract

Background: Paediatric trainees and parents of Kawasaki disease (KD) patients may face challenges in managing the disease because of limited access to KD information. Chat Generative Pre-trained Transformer (ChatGPT), based on large language models, may help explain KD information to paediatric trainees and patients' parents.

Purpose: This study aims to determine the usefulness of ChatGPT-4 in explaining Kawasaki disease and its management to parents and to paediatric trainees managing KD patients.

Methods: We created two sets of clinical scenarios. In the first set, ChatGPT-4 was instructed to respond to 10 questions based on enquiries from parents of KD patients. Three paediatric cardiologists independently scored the responses for "factual accuracy", "coherence", "comprehensiveness", and "humaneness" on a 0-10 Likert scale. Readability was calculated using the Flesch reading-ease test. In the second set, ChatGPT-4 was instructed to respond to 8 KD-related questions based on enquiries from paediatric trainees. The three paediatric cardiologists independently graded the responses for "relevance", "reliability", and "comprehensiveness" on a 0-10 Likert scale, and determined whether they would adopt ChatGPT-4's major advice in their clinical judgement.

Results: For parent-targeted responses, ChatGPT-4 achieved the highest scores in "humaneness" (median 9.00, IQR 8.00 to 9.00) and "coherence" (median 8.00, IQR 7.00 to 8.00). Inaccurate information regarding disease prognosis, actions, and the prescription of medications and surgery was found in 80% of scenarios. Missing information regarding long-term coronary complications, antiplatelet management, and cardiac assessments was found in all 10 scenarios. Mean readability of parent-targeted responses was 71.70 ± 6.26, a level easily understood by 12-year-olds.
For paediatrician-targeted responses, ChatGPT-4 achieved the highest scores in "relevance" (median 9.50, IQR 7.25 to 9.0). Inaccurate information regarding coronary interventions, patient education, and immunization recommendations was found in 37.5% of scenarios. Missing information regarding patient education and stress imaging was found in 25% of scenarios. All reviewers would adopt ChatGPT-4's advice in 87.5% of scenarios.

Conclusions: ChatGPT-4 has significant limitations in accuracy and omits salient information when providing KD recommendations for parents and paediatric trainees.

(Figures: Performance in parent-targeted questions; Performance in trainee-targeted questions)
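For readers unfamiliar with the readability metric used in the Methods, the Flesch reading-ease score is computed as 206.835 − 1.015 × (words/sentences) − 84.6 × (syllables/words); scores in the 70s correspond roughly to text readable by 12- to 13-year-olds. Below is a minimal illustrative sketch, not the tooling used in this study; the vowel-group syllable counter is a simplifying assumption, and published implementations use more careful syllabification.

```python
import re

def count_syllables(word: str) -> int:
    """Rough heuristic: count vowel groups, discount a trailing silent 'e'."""
    word = word.lower()
    groups = re.findall(r"[aeiouy]+", word)
    count = len(groups)
    if word.endswith("e") and count > 1:
        count -= 1
    return max(count, 1)

def flesch_reading_ease(text: str) -> float:
    """Flesch reading-ease: higher scores indicate easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * (len(words) / len(sentences))
            - 84.6 * (syllables / len(words)))
```

Short, monosyllabic sentences score near the top of the scale, while long sentences packed with polysyllabic clinical terms drive the score down, which is why parent-targeted answers averaging 71.70 sit in the "fairly easy" band.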