Abstract

To develop a content-aware chatbot based on GPT-3.5-Turbo and GPT-4 with specialized knowledge on the German S2 Cone-Beam CT (CBCT) dental imaging guideline and to compare the performance against humans. The LlamaIndex software library was used to integrate the guideline context into the chatbots. Based on the CBCT S2 guideline, 40 questions were posed to content-aware chatbots and early career and senior practitioners with different levels of experience served as reference. The chatbots' performance was compared in terms of recommendation accuracy and explanation quality. Chi-square test and one-tailed Wilcoxon signed rank test evaluated accuracy and explanation quality, respectively. The GPT-4 based chatbot provided 100% correct recommendations and superior explanation quality compared to the one based on GPT3.5-Turbo (87.5% vs. 57.5% for GPT-3.5-Turbo; P = .003). Moreover, it outperformed early career practitioners in correct answers (P = .002 and P = .032) and earned higher trust than the chatbot using GPT-3.5-Turbo (P = 0.006). A content-aware chatbot using GPT-4 reliably provided recommendations according to current consensus guidelines. The responses were deemed trustworthy and transparent, and therefore facilitate the integration of artificial intelligence into clinical decision-making.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.