Abstract

Study question
Can large language models be used to understand patient needs from conversational data?

Summary answer
Large language models can provide significant assistance in identifying and summarizing patients' queries.

What is known already
Traditionally, clinics have relied on techniques such as patient surveys, reviews and complaints procedures to understand shortcomings in the patient experience. As many clinics adopt digital platforms for communicating with patients, they have accumulated a wealth of conversational data that may shed light on the patient experience. However, the volume of data in clinic chat apps is often so great that analyzing it becomes challenging. Recently, advances in large language models (LLMs) have enabled automated text analysis at almost human-level performance. This study therefore investigates whether LLMs can be used to extract insights from conversational data.

Study design, size, duration
This study is a retrospective analysis of 132,596 messages sent by patients to fertility advisors, representing 40,853 questions asked. These conversations took place on a single centre's patient communication app from 01/01/2021 to 09/06/2023. All patient types at all treatment stages were included. A private instance of the open-source Mistral-7B-Instruct-v0.1 LLM running on a single NVIDIA Titan X GPU was used for text analysis.

Participants/materials, setting, methods
Conversations were broken down into sentences, which the LLM then classified as either questions or non-questions. Next, the LLM categorized the individual questions, returning a category, a subcategory and a question summary. These summaries were then embedded and clustered using the K-means algorithm (with k chosen by the elbow method). The LLM was then used to summarize the content of each cluster as five questions. Finally, thematic analysis was conducted by a patient experience expert.
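The embedding-and-clustering step described above can be sketched as follows. This is a minimal illustration, not the study's actual pipeline: the K-means and elbow-method implementations are simple NumPy stand-ins for whatever library the authors used, and the synthetic vectors stand in for real embeddings of the LLM-generated question summaries.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Minimal K-means (Lloyd's algorithm): returns labels and total inertia."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # Squared distances from every point to every center: (n, k).
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    inertia = ((X - centers[labels]) ** 2).sum()
    return labels, inertia

def elbow_k(X, k_values):
    """Pick k at the sharpest bend of the inertia curve (largest second
    difference) -- a crude proxy for eyeballing the elbow plot."""
    inertias = [kmeans(X, k)[1] for k in k_values]
    bends = np.diff(inertias, n=2)
    return k_values[int(np.argmax(bends)) + 1]

# Hypothetical stand-ins for embedded question summaries: three synthetic
# groups of 32-dimensional vectors.
rng = np.random.default_rng(1)
embeddings = np.vstack([
    rng.normal(loc=c, scale=0.3, size=(100, 32)) for c in (-2.0, 0.0, 2.0)
])

k = elbow_k(embeddings, list(range(2, 9)))
labels, _ = kmeans(embeddings, k)
```

In practice each resulting cluster's member summaries would then be passed back to the LLM to be condensed into five representative questions, as the abstract describes.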
Main results and the role of chance
In the initial phase of the study, the LLM classified 34,222 questions into 6,177 categories and 13,533 subcategories. These were subsequently consolidated into 145 distinct clusters. Each cluster comprised, on average, 215±108 (M±SD) inquiries, excluding a notably larger outlier cluster that absorbed 3,300 inquiries which did not fit neatly into any other cluster. Examination of cluster centroids identified seven predominant themes: legal/financial (N = 1,218), general fertility advice (N = 1,430), patient administration (N = 8,734), medical tests (N = 4,820), medical procedures (N = 1,575), appointment scheduling (N = 4,320), and specific treatment/medication information (N = 9,125). Of the 145 clusters, 64, comprising 12,313 inquiries, were identified by the expert as highlighting points for improvement in the patient experience. These broadly encompassed operational bottlenecks/weak points, issues with patient guidance and issues with communication. Examples include "Are there any late afternoon or evening appointment options available?" and "How long do eggs remain viable after they have been frozen?". Overall, our use of LLMs enabled the analysis of a volume of queries that would previously have proven prohibitively expensive, time-consuming and labor-intensive.

Limitations, reasons for caution
Splitting conversations into sentences meant that context could not be taken into account and multi-message questions were hard to identify. Additionally, the interpretation of messages by LLMs and humans may not be aligned. Finally, the technical expertise required to execute this style of analysis may prove a barrier for clinics.

Wider implications of the findings
We demonstrate that LLMs can be used to draw insights from the wealth of digital communications held by modern IVF clinics. LLMs may thus enable clinics to collect data on the patient experience in a faster, more reliable way than traditional approaches such as patient surveys and complaints.

Trial registration number
Not applicable
