Navigating Intimate Partner Violence: A Comparative Analysis Between ACOG and ChatGPT on Patient Education [ID 2683552]

Angelo Cadiente, Antonia F Oladipo, Puja Patel

doi:10.1097/01.aog.0001013984.40432.dc

Abstract

INTRODUCTION: With the surge in artificial intelligence usage, there is an increasing need to assess its applicability in conveying public health information. We compared the readability and quality of responses from ChatGPT and the American College of Obstetricians and Gynecologists (ACOG) to frequently asked questions (FAQs) on intimate partner violence (IPV). METHODS: Twelve questions from ACOG’s “Intimate Partner Violence” FAQs were posed to ChatGPT-3.5 (July 19 Version). Readability and grade-level scores were determined. Response quality was also graded by two obstetrician–gynecologists using a 1–4 scale, where 1 represents a comprehensive response and 4 indicates an incorrect response. Statistical analysis used a two-tailed t-test. A weighted Cohen’s kappa coefficient evaluated interrater reliability. RESULTS: Mean readability favored ACOG over ChatGPT, but only the difference in Coleman-Liau Index was statistically significant (ACOG: 12.71; ChatGPT: 15.76; P=.003). Other readability measures demonstrated no significant differences. The ACOG responses received a mean grade of 1.29, with a Cohen’s kappa coefficient of 0.375, implying fair agreement between graders. All of ChatGPT’s responses were graded a 1, with a Cohen’s kappa coefficient of 1, indicating perfect agreement. The difference in grades between ACOG and ChatGPT was statistically significant (P=.013). CONCLUSION: The ACOG FAQ responses are largely comparable in readability to ChatGPT-generated responses, with only one statistically significant difference across the indices. However, ChatGPT’s responses were graded as more comprehensive and accurate. Although patients can use ACOG as a source, ChatGPT provides comparably clear and more comprehensive information on IPV.
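
For readers unfamiliar with the one index that reached significance, the following is a minimal sketch of the standard Coleman-Liau formula, CLI = 0.0588 L - 0.296 S - 15.8, where L is the mean number of letters per 100 words and S is the mean number of sentences per 100 words. The abstract does not say which tool produced its scores, so the function name `coleman_liau_index` and the simple regular-expression tokenization below are illustrative assumptions, not the authors' method.

```python
import re


def coleman_liau_index(text: str) -> float:
    """Approximate Coleman-Liau Index of `text` (hypothetical helper).

    Tokenization is deliberately naive (an assumption of this sketch):
    words are runs of ASCII letters, optionally joined by an apostrophe
    or hyphen, and every run of '.', '!' or '?' ends one sentence.
    """
    words = re.findall(r"[A-Za-z]+(?:['’-][A-Za-z]+)*", text)
    if not words:
        raise ValueError("text contains no words")

    # Count only the alphabetic characters inside the matched words.
    letters = sum(ch.isalpha() for word in words for ch in word)
    # Treat text with no terminal punctuation as a single sentence.
    sentences = max(1, len(re.findall(r"[.!?]+", text)))

    letters_per_100_words = letters / len(words) * 100
    sentences_per_100_words = sentences / len(words) * 100
    return 0.0588 * letters_per_100_words - 0.296 * sentences_per_100_words - 15.8


if __name__ == "__main__":
    plain = "The cat sat on the mat. It was happy."
    dense = "Intimate partner violence encompasses psychological coercion."
    print(round(coleman_liau_index(plain), 2))
    print(round(coleman_liau_index(dense), 2))
```

The index approximates a US school grade level, so a higher score means harder text. Read this way, the reported means (ACOG: 12.71; ChatGPT: 15.76) place the ACOG answers near the end of high school and the ChatGPT answers at college level on this measure.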
