Abstract

Introduction: ChatGPT is an artificial intelligence (AI) platform with expanding uses across society, including the solicitation of medical advice by patients. Traditionally, men’s health patients obtained educational information through the Urology Care Foundation (UCF) or institutional websites. Today, men are increasingly turning to social media and AI, given improving technology and the prevalent stigma surrounding sexual health issues. Previous studies have demonstrated the ability of ChatGPT to perform certain physician tasks, but studies of its patient-facing use are limited. Most online health educational information far exceeds the American sixth-grade reading level recommended by the NIH and AMA. Hence, existing resources are inaccessible to large swaths of the population. In this study, we asked whether AI may help improve the provision of health educational information for the public.

Objective: To conduct the first analysis of ChatGPT-created information regarding men’s sexual health conditions and provide a statistical comparison with resources produced by UCF.

Methods: Frequently asked patient questions regarding erectile dysfunction, premature ejaculation, low testosterone, sperm retrieval, penile augmentation, and male infertility were compiled from the American Urological Association. Questions covered the definition of the condition, etiology, diagnostics, treatment, prognosis, and common patient concerns. Responses from UCF and ChatGPT were compared using the following validated readability formulas: Flesch Reading Ease, Flesch-Kincaid Grade Level, Gunning-Fog Index, Simple Measure of Gobbledygook, Coleman-Liau Index, and Automated Readability Index. Descriptive analysis of word count, sentence length, syllable count, and word complexity was also performed.
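For illustration only, two of the readability formulas named above (Flesch Reading Ease and Flesch-Kincaid Grade Level) can be sketched from their standard published coefficients. This is a minimal sketch and not the scoring tool used in the study, which the abstract does not name. The helper `rough_counts` and its vowel-group syllable heuristic are hypothetical simplifications; validated calculators count syllables more carefully, so scores from this sketch will differ from theirs.

```python
import re


def flesch_reading_ease(words, sentences, syllables):
    """Flesch Reading Ease: higher scores indicate easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid Grade Level: approximate US school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def rough_counts(text):
    """Hypothetical helper: crude word, sentence, and syllable counts.

    Assumes non-empty English text. Syllables are approximated as runs
    of vowels, which miscounts silent letters and some diphthongs.
    """
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(
        max(1, len(re.findall(r"[aeiouy]+", w.lower()))) for w in words
    )
    return len(words), sentences, syllables


if __name__ == "__main__":
    sample = "The cat sat. It ran!"  # made-up text, not study data
    w, s, syl = rough_counts(sample)
    print(flesch_reading_ease(w, s, syl), flesch_kincaid_grade(w, s, syl))
```

Both formulas depend only on average sentence length and average syllables per word, so longer sentences and more polysyllabic words lower the Flesch Reading Ease and raise the grade level.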
Furthermore, adjusted ChatGPT (ChatGPT-a) responses were generated by prompting the language model with the command “Explain it to me like I am in sixth grade.” Finally, two independent reviewers graded all responses for accuracy and comprehensiveness. Statistical comparisons were made using Welch’s unpaired t-test.

Results: Readability scores are presented in Table 1. UCF responses were significantly more readable than ChatGPT responses for all six sexual medicine topics across all scoring metrics (all p<0.001). The Flesch-Kincaid Grade Level (FKGL) was 8.87 for UCF versus 14.83 for ChatGPT. ChatGPT responses were longer (278.3 versus 222.9 words) and included more complex words (28.1% versus 14.3%). When prompted with a command for more accessible language (ChatGPT-a), responses approached the readability of UCF across all metrics, including an average Flesch Reading Ease of 54.8 and FKGL of 9.6. UCF and ChatGPT were equal in quality and accuracy on qualitative analysis.

Conclusions: Men’s health information provided by ChatGPT is less accessible than that provided by UCF, although both platforms exceed the recommended sixth-grade level. Given its sensitive nature, sexual medicine information is increasingly sought out online. Our findings indicate that AI can simplify online information to accommodate an individual user’s health literacy, but improvement in the current platform is needed. Future iterations of ChatGPT may be adapted towards the provision of medical information and trained on evidence-based literature, hence improving both readability and quality. As AI grows, providers must study this new platform to better understand and assist their patients.

Disclosure: No.