Abstract

Background
Stomas present significant lifestyle and psychological challenges for patients, requiring comprehensive education and support. Current educational methods have limitations in offering relevant information to the patient, highlighting a potential role for artificial intelligence (AI). This study examined the utility of AI in enhancing stoma therapy management following colorectal surgery.

Material and Methods
We compared four prominent large language models (LLMs), namely OpenAI's ChatGPT‐3.5 and ChatGPT‐4.0, Google's Gemini, and Bing's CoPilot, against a series of metrics to evaluate their suitability as supplementary clinical tools. Through qualitative and quantitative analyses, including readability scores (Flesch–Kincaid, Flesch Reading Ease, and Coleman–Liau index) and reliability assessments (Likert scale, DISCERN score, and QAMAI tool), the study aimed to assess the appropriateness of LLM‐generated advice for patients managing stomas.

Results
Readability and reliability varied across the evaluated models, with CoPilot and ChatGPT‐4 demonstrating superior performance in several key metrics such as readability and comprehensiveness. However, the study underscores the early stage of LLM technology in clinical applications. All responses required a high school to college level of education to comprehend comfortably. While the LLMs addressed users' questions directly, they did not incorporate patient‐specific factors such as past medical history, and so produced broad, generic responses rather than tailored advice.

Conclusion
The complexity of individual patient conditions can challenge AI systems. The use of LLMs in clinical settings holds promise for improving patient education and stoma management support, but requires careful consideration of the models' capabilities and the context of their use.