Abstract

Introduction
With the advancement of artificial intelligence, large language models (LLMs) have emerged as a technology that can generate human-like text across various domains. They hold vast potential in the dental field and could be integrated into clinical dentistry, administrative dentistry, and student and patient education. However, the successful integration of LLMs into dentistry relies on the dental knowledge of the models used, as inaccuracies can lead to significant risks in patient care and education.

Aims
We are the first to compare different LLMs on their dental knowledge by testing the accuracy of model responses to Integrated National Board Dental Examination (INBDE) questions.

Methods
We included closed-source and open-source models and analysed responses to both 'patient box' style board questions and more traditional, text-based, multiple-choice questions.

Results
For the entire INBDE question bank, ChatGPT-4 demonstrated the highest dental knowledge, with an accuracy of 75.88%, followed by Claude-2.1 at 66.38% and Mistral-Medium at 54.77%. There was a statistically significant difference in performance across all models.

Conclusion
Our results highlight the high potential of LLM integration into the dental field, the importance of which LLM is chosen when developing new technologies, and the limitations that must be overcome before unsupervised clinical integration can be adopted.
