Responses of Five Different Artificial Intelligence Chatbots to the Top Searched Queries About Erectile Dysfunction: A Comparative Analysis

Mehmet Fatih Şahin, Hüseyin Ateş, Anıl Keleş, Rıdvan Özcan, Çağrı Doğan, Murat Akgül, Cenk Murat Yazıcı

doi:10.1007/s10916-024-02056-0

Open Access

Journal: Journal of Medical Systems
Publication Date: Apr 3, 2024
Citations: 5
License type: CC BY 4.0

Abstract

This study evaluates and compares the quality and readability of responses generated by five artificial intelligence (AI) chatbots (ChatGPT, Bard, Bing, Ernie, and Copilot) to the most frequently searched queries about erectile dysfunction (ED). Google Trends was used to identify relevant ED-related phrases. Each AI chatbot received a specific sequence of 25 frequently searched terms as input. Responses were evaluated using the DISCERN, Ensuring Quality Information for Patients (EQIP), Flesch-Kincaid Grade Level (FKGL), and Flesch-Kincaid Reading Ease (FKRE) metrics. The three most frequently searched phrases were “erectile dysfunction cause”, “how to erectile dysfunction”, and “erectile dysfunction treatment”. Zimbabwe, Zambia, and Ghana showed the highest level of interest in ED. None of the AI chatbots achieved the necessary degree of readability. However, Bard exhibited significantly higher FKRE and FKGL ratings (p = 0.001), and Copilot achieved better EQIP and DISCERN ratings than the other chatbots (p = 0.001). Bard used the simplest language and posed the least challenge in terms of readability and comprehension, while Copilot’s text quality on ED was superior to that of the other chatbots. As new chatbots are introduced, their understandability and text quality increase, providing better guidance to patients.
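
For readers unfamiliar with the two readability metrics named above, the following is a minimal Python sketch of the standard published Flesch Reading Ease and Flesch-Kincaid Grade Level formulas. It is an illustration only: the abstract does not say which tool the authors used, and the function names, the sentence splitter, and the vowel-run syllable counter are assumptions made here, so scores will differ somewhat from those of dedicated readability software.

```python
import re


def count_syllables(word: str) -> int:
    """Approximate syllables as runs of vowels (hypothetical heuristic, not dictionary-based)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def readability(text: str) -> tuple[float, float]:
    """Return (FKRE, FKGL) for a text using the standard formula constants."""
    # Naive segmentation: sentences end at ., ! or ?; words are runs of letters.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")

    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)

    # Higher FKRE means easier text; FKGL approximates a US school grade level.
    fkre = 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
    fkgl = 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59
    return fkre, fkgl


if __name__ == "__main__":
    sample = "Erectile dysfunction has many causes. Talk to your doctor about treatment."
    fkre, fkgl = readability(sample)
    print(f"FKRE: {fkre:.1f}, FKGL: {fkgl:.1f}")
```

Both scores depend only on average sentence length and average syllables per word, so they measure surface complexity of the wording rather than the accuracy or completeness of the information, which is what DISCERN and EQIP are meant to assess.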

Full Text