Abstract

Background
ChatGPT is a large language model-based chatbot created by OpenAI. Since its release, ChatGPT has gained widespread attention in the healthcare community for its potential utility as a medical practice tool, including adaptation as a clinical decision-support tool. In this study we explored the potential of ChatGPT as a decision-support tool for acute ulcerative colitis (UC) presentations in the emergency department (ED).

Methods
We examined 20 distinct acute UC presentations to the ED, collected over two years. Case summaries containing key data points, such as symptoms, vital signs, and laboratory results, were submitted to ChatGPT. For each case, we asked ChatGPT to assess disease severity according to the Truelove and Witts classification, substituting C-reactive protein ≥12 for erythrocyte sedimentation rate ≥30, and to recommend hospitalization or outpatient care based on that severity. The answers were compared with the assessments made by our department's gastroenterologists and with the actual decision made by the ED physician.

Results
Overall, ChatGPT classified 12, 7 and 1 patients as having severe, moderate and mild disease, respectively. For each case, ChatGPT supplied a detailed answer giving the severity of every variable in the criteria and an overall severity classification (Table 1). ChatGPT assigned the same severity as our gastroenterologists in 16/20 (80%) of the patients. Reliability between the two assessments was high: the average-measures intraclass correlation coefficient of absolute agreement was 0.839 (95% confidence interval 0.588-0.937, F = 5.95, p < 0.001). The discrepancies in the four remaining cases stemmed primarily from inaccurate cut-off values for systemic variables. Following severity assessment, ChatGPT leaned towards hospitalization for 16 of 18 patients (88.9%); for the two remaining cases, both moderate UC, it could not provide a decisive recommendation. By comparison, only 12 of the 20 patients were hospitalized in actual clinical practice.

Conclusion
Our findings suggest that ChatGPT has potential as a clinical decision-support tool for assessing UC severity and recommending a suitable setting for further treatment. While this concept warrants further investigation and validation, the ability to evaluate a clinical scenario against established criteria could benefit the fields of inflammatory bowel disease and gastroenterology.
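
The proportions reported in the Results use two different denominators (all 20 cases for agreement and actual hospitalization, but 18 cases for ChatGPT's hospitalization recommendations), which is easy to misread. The following is a minimal sketch that recomputes them from the counts stated in the abstract. The function name `pct` and the variable names are illustrative assumptions, and the 60% figure for actual hospitalization is derived here from the reported 12/20 rather than stated in the abstract.

```python
def pct(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage, rounded to one decimal."""
    return round(100 * numerator / denominator, 1)


TOTAL_CASES = 20

# Severity distribution assigned by ChatGPT, as reported in the abstract.
severity_counts = {"severe": 12, "moderate": 7, "mild": 1}
counts_sum = sum(severity_counts.values())  # should equal TOTAL_CASES

# Agreement with the gastroenterologists: 16 of all 20 cases.
agreement_pct = pct(16, TOTAL_CASES)

# ChatGPT gave no decisive recommendation for 2 moderate cases,
# so its hospitalization rate is computed over the remaining 18.
undecided = 2
decisive_cases = TOTAL_CASES - undecided
chatgpt_hospitalization_pct = pct(16, decisive_cases)

# Actual ED practice: 12 of all 20 patients were hospitalized
# (percentage derived here, not reported in the abstract).
actual_hospitalization_pct = pct(12, TOTAL_CASES)

print(agreement_pct, chatgpt_hospitalization_pct, actual_hospitalization_pct)
```

Because the denominators differ, the 88.9% and 60% figures are not directly comparable case for case. The sketch only makes the arithmetic behind each reported figure explicit.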