Frequently Asked Questions Research Articles

Introduction: Minimally invasive spine surgery (MISS) has evolved over the last three decades as a less invasive alternative to traditional spine surgery, offering benefits such as smaller incisions, faster recovery, and lower complication rates. With patients frequently seeking information about MISS online, the comprehensibility and accuracy of this information are crucial. Recent studies have shown that much of the online material regarding spine surgery exceeds the recommended readability levels, making it difficult for patients to understand. This study explores the clinical appropriateness and readability of responses generated by Chat Generative Pre-Trained Transformer (ChatGPT) to frequently asked questions (FAQs) about MISS.

Methods: A set of 15 FAQs was formulated based on clinical expertise and existing literature on MISS. Each question was independently inputted into ChatGPT five times, and the generated responses were evaluated by three neurosurgery attendings for clinical appropriateness. Appropriateness was judged based on accuracy, readability, and patient accessibility. Readability was assessed using seven standardized readability tests, including the Flesch-Kincaid Grade Level and Flesch Reading Ease (FRE) scores. Statistical analysis was performed to compare readability scores across preoperative, postoperative, and intraoperative/technical question categories.

Results: The mean readability scores for preoperative, postoperative, and intraoperative/technical questions were 15±2.8, 16±3, and 15.7±3.2, respectively, significantly exceeding the recommended sixth- to eighth-grade reading level for patient education (p=0.017). Differences in readability across individual questions were also statistically significant (p<0.001). All responses required a reading level above 11th grade, with a majority indicating college-level comprehension. Although preoperative and postoperative questions generally elicited clinically appropriate responses, 50% of intraoperative/technical questions yielded either "inappropriate" or "unreliable" responses, particularly for inquiries about radiation exposure and the use of lasers in MISS.

Conclusions: While ChatGPT is proficient in providing clinically appropriate responses to certain FAQs about MISS, it frequently produces responses that exceed the recommended readability level for patient education. This limitation suggests that its utility may be confined to highly educated patients, potentially exacerbating existing disparities in patient comprehension. Future AI-based patient education tools must prioritize clear and accessible communication, with oversight from medical professionals to ensure accuracy and appropriateness. Further research comparing ChatGPT's performance with other AI models could enhance its application in patient education across medical specialties.
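For readers unfamiliar with the two readability measures named in the abstract above, the following is a minimal sketch of the standard published Flesch Reading Ease and Flesch-Kincaid Grade Level formulas. The abstract does not say which software the authors used, so the function names, the tokenization, and especially the syllable counter here are illustrative assumptions; the syllable counter is a rough vowel-group heuristic, not a dictionary-based count.

```python
import re


def flesch_reading_ease(words, sentences, syllables):
    """Standard Flesch Reading Ease formula (higher = easier to read)."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words, sentences, syllables):
    """Standard Flesch-Kincaid Grade Level formula (approximate US school grade)."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def count_syllables(word):
    """Rough heuristic (an assumption): count vowel groups, drop one silent final 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def text_stats(text):
    """Return (words, sentences, syllables) using simple regex tokenization."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return len(words), len(sentences), syllables


if __name__ == "__main__":
    sample = "The cat sat. The dog ran."
    w, s, syl = text_stats(sample)
    print(w, s, syl)
    print(flesch_reading_ease(w, s, syl), flesch_kincaid_grade(w, s, syl))
```

Both formulas depend only on average sentence length and average syllables per word, which is why long sentences and polysyllabic medical terms push a response toward a college-level grade.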

Background: The consumer availability and automated response functions of Chat Generative Pre-Trained Transformer (ChatGPT-4), a large language model, position this application for use in answering patient health queries, and it may have a role as an adjunct to minimize administrative and clinical burden.

Purpose: To evaluate the ability of ChatGPT-4 to respond to patient inquiries concerning ulnar collateral ligament (UCL) injuries and compare these results with the performance of Google.

Study design: Cross-sectional study.

Methods: Google Web Search was used as a benchmark, as it is the most widely used search engine worldwide and the only search engine that generates frequently asked questions (FAQs) when prompted with a query, allowing comparisons through a systematic approach. The query "ulnar collateral ligament reconstruction" was entered into Google, and the top 10 FAQs, answers, and their sources were recorded. ChatGPT-4 was prompted to perform a Google search of FAQs with the same query and to record the sources of answers for comparison. This process was then repeated to obtain 10 new questions requiring numeric instead of open-ended responses. Finally, responses were graded independently for clinical accuracy (grade 0 = inaccurate, grade 1 = somewhat accurate, grade 2 = accurate) by 2 fellowship-trained sports medicine surgeons (D.W.A., J.S.D.) blinded to the search engine and answer source.

Results: ChatGPT-4 used a greater proportion of academic sources than Google to provide answers to the top 10 FAQs, although this was not statistically significant (90% vs 50%; P = .14). In terms of question overlap, 40% of the most common questions on Google and ChatGPT-4 were the same. When comparing FAQs with numeric responses, 20% of answers were completely overlapping, 30% demonstrated partial overlap, and the remaining 50% did not demonstrate any overlap. All sources used by ChatGPT-4 to answer these FAQs were academic, while only 20% of sources used by Google were academic (P = .0007). The remaining Google sources included social media (40%), medical practices (20%), single-surgeon websites (10%), and commercial websites (10%). The mean (± standard deviation) accuracy for answers given by ChatGPT-4 was significantly greater compared with Google for the top 10 FAQs (1.9 ± 0.2 vs 1.2 ± 0.6; P = .001) and top 10 questions with numeric answers (1.8 ± 0.4 vs 1 ± 0.8; P = .013).

Conclusion: ChatGPT-4 is capable of providing responses with clinically relevant content concerning UCL injuries and reconstruction. ChatGPT-4 utilized a greater proportion of academic websites to provide responses to FAQs representative of patient inquiries compared with Google Web Search and provided significantly more accurate answers. Moving forward, ChatGPT has the potential to be used as a clinical adjunct when answering queries about UCL injuries and reconstruction, but further validation is warranted before integrated or autonomous use in clinical settings.
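The source-proportion comparisons in the abstract above (90% vs 50%, and 100% vs 20%, of 10 answers each) can be illustrated with a small worked example. The abstract does not name the statistical test used, so the sketch below assumes a two-sided Fisher's exact test, a common choice for 2x2 tables with small counts; the function name and the table layout are illustrative, and the counts are inferred from the reported percentages.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables (with the same
    margins) that are no more likely than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def weight(k):
        # Unnormalized probability of a table whose top-left cell equals k.
        return comb(row1, k) * comb(row2, col1 - k)

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    observed = weight(a)
    # Integer comparison keeps ties between equally likely tables exact.
    extreme = sum(weight(k) for k in range(lo, hi + 1) if weight(k) <= observed)
    return extreme / comb(n, col1)


if __name__ == "__main__":
    # Rows: ChatGPT-4, Google; columns: academic, non-academic sources.
    p_top10 = fisher_exact_two_sided(9, 1, 5, 5)     # 90% vs 50%
    p_numeric = fisher_exact_two_sided(10, 0, 2, 8)  # 100% vs 20%
    print(round(p_top10, 2), round(p_numeric, 4))    # 0.14 0.0007
```

Under this assumption the computed values round to the reported P = .14 and P = .0007, which is consistent with, though not proof of, an exact test having been used.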

Articles published on Frequently Asked Questions

FAQ-Gen: An automated system to generate domain-specific FAQs to aid content comprehension

Is the information provided by large language models valid in educating patients about adolescent idiopathic scoliosis? An evaluation of content, clarity, and empathy : The perspective of the European Spine Study Group.

Evaluation of validity and reliability of AI Chatbots as public sources of information on dental trauma.

The utility of ChatGPT in gender-affirming mastectomy education

Assessing the Clinical Appropriateness and Practical Utility of ChatGPT as an Educational Resource for Patients Considering Minimally Invasive Spine Surgery.

Does Introducing FAQs Boost Self-Service Government?—A Study of Local Government Websites

ChatGPT for Addressing Patient-centered Frequently Asked Questions in Glaucoma Clinical Practice

Can ChatGPT answer patient questions regarding reverse shoulder arthroplasty?

A Multi-Lingual Conversational AI Chatbot for Effective Educational Consultations: A Study of ACE-DS, University of Rwanda

Practical Answers to Frequently Asked Questions in Anterior Cervical Spine Surgery for Degenerative Conditions.

Analyzing the performance of ChatGPT in answering inquiries about cervical cancer.

Answers to frequently asked questions about the pulsar timing array Hellings and Downs curve

Frequently asked questions to the 2023 Obesity Medicine Association Position Statement on Compounded Peptides: A call for action

Is ChatGPT a Reliable Source of Patient Information on Asthma?

Assessing ChatGPT Ability to Answer Frequently Asked Questions About Essential Tremor.

Improving access to buprenorphine for rural veterans in a learning health care system.

Understanding How ChatGPT May Become a Clinical Administrative Tool Through an Investigation on the Ability to Answer Common Patient Questions Concerning Ulnar Collateral Ligament Injuries.

Poster 372: ChatGPT is a Useful Tool for Patients with Acute Achilles Tendon Ruptures

ChatGPT and Google Provide Mostly Excellent or Satisfactory Responses to the Most Frequently Asked Patient Questions Related to Rotator Cuff Repair

Reliability of artificial intelligence chatbot responses to frequently asked questions in breast surgical oncology.
