231

Background: Non-small cell lung cancer (NSCLC) targeted therapies are complex, often requiring extensive patient education beyond what can be conveyed in a single provider visit. Given that the average US reading level is approximately 8th grade and many patients seek additional information online, it is vital to evaluate the readability of online resources. This study aims to determine the patient-friendliness of information provided by AI chatbots such as ChatGPT in explaining NSCLC targeted therapies compared with other commonly accessed sources.

Methods: This analysis included targeted therapies approved by the FDA for EGFR-mutated NSCLC: Afatinib, Erlotinib, Osimertinib, Dacomitinib, and Gefitinib. Information sources included ChatGPT 4, ChatGPT 3.5, the Patient Information section of the FDA label, and the Google featured snippet. Each version of ChatGPT was queried 10 times for each therapy with the prompt "My doctor is suggesting starting therapy with (Drug Name). What can I expect with this medication?" All responses, labels, and snippets were analyzed in Microsoft Word for word count (WC) and Flesch Reading Ease Score (FRES), which uses average sentence length and average syllables per word to assess readability; higher FRES values indicate easier-to-read text. Data analysis was performed using ANOVA in SAS.

Results: Across all therapies, mean FRES was 31.21 for ChatGPT 4, 26.69 for ChatGPT 3.5, 50.78 for FDA labels, and 33.1 for Google snippets. Mean WC was 312.54 for ChatGPT 4, 355.78 for ChatGPT 3.5, 1220 for FDA labels, and 42 for Google snippets. Significant differences in FRES were found among sources for Afatinib (p < 0.0001), Erlotinib (p = 0.0206), Osimertinib (p < 0.0001), Dacomitinib (p = 0.0125), and Gefitinib (p < 0.0001). Significant differences in WC were also found among sources for Afatinib (p < 0.0001), Erlotinib (p < 0.0001), Osimertinib (p < 0.0001), and Gefitinib (p = 0.0045); no significant difference in WC was found for Dacomitinib (p = 0.2807).

Conclusions: While the FDA label provided the most readable information, none of the sources aligned with the average 8th-grade US reading level, indicating a gap in accessible patient education for NSCLC targeted therapies. Notably, readability differed significantly between ChatGPT versions, reducing accessibility for users of the free version, ChatGPT 3.5. Although ChatGPT responses are more succinct than the verbose FDA labels, their higher reading level presents a barrier to comprehension. This study highlights the need for further research into optimizing AI-generated patient education materials to be both accurate and comprehensible at appropriate reading levels, thereby improving health communication strategies.
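For reference, the Flesch Reading Ease Score reported by Microsoft Word's readability statistics follows the standard Flesch formula, in which longer sentences and more syllables per word lower the score:

\[ \text{FRES} = 206.835 \;-\; 1.015\left(\frac{\text{total words}}{\text{total sentences}}\right) \;-\; 84.6\left(\frac{\text{total syllables}}{\text{total words}}\right) \]

A minimal sketch of the per-drug comparison is shown below; the study itself ran ANOVA in SAS, and this illustrates only the equivalent one-way ANOVA across sources in Python with SciPy. All FRES values in the sketch are illustrative placeholders, not study data.

# Illustrative sketch of a per-drug one-way ANOVA across information sources.
# The study used ANOVA in SAS; this Python/SciPy version is an assumed equivalent.
from scipy.stats import f_oneway

chatgpt4_fres = [30.5, 32.1, 31.0, 29.8, 33.2]    # placeholder scores from repeated queries
chatgpt35_fres = [25.9, 27.4, 26.1, 28.0, 24.7]   # placeholder scores from repeated queries
fda_label_fres = [50.8, 51.2]                     # placeholder label scores
snippet_fres = [33.1, 34.0]                       # placeholder snippet scores

# One-way ANOVA testing whether mean FRES differs among the four sources
f_stat, p_value = f_oneway(chatgpt4_fres, chatgpt35_fres, fda_label_fres, snippet_fres)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")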