Combining Topic Modeling, Sentiment Analysis, and Corpus Linguistics to Analyze Unstructured Web-Based Patient Experience Data: Case Study of Modafinil Experiences.

Julia Walsh, Jonathan Cave, Frances Griffiths

doi:10.2196/54321

Abstract

Patient experience data from social media offer patient-centered perspectives on disease, treatments, and health service delivery. Current guidelines typically rely on systematic reviews, while qualitative health studies are often seen as anecdotal and nongeneralizable. This study explores combining personal health experiences from multiple sources to create generalizable evidence. The study aims to (1) investigate how combining unsupervised natural language processing (NLP) and corpus linguistics can explore patient perspectives from a large unstructured dataset of modafinil experiences, (2) compare findings with Cochrane meta-analyses on modafinil's effectiveness, and (3) develop a methodology for analyzing such data. We analyzed 69,022 posts from 790 sources using a range of NLP and corpus techniques: data cleaning designed to maximize post context, Python for the NLP techniques, and Sketch Engine for linguistic analysis. We used multiple topic mining approaches, such as latent Dirichlet allocation, nonnegative matrix factorization, and word-embedding methods. Sentiment analysis used TextBlob and Valence Aware Dictionary and Sentiment Reasoner (VADER), while corpus methods included collocation, concordance, and n-gram generation. Previous work had mapped topic mining to themes, such as health conditions, reasons for taking modafinil, symptom impacts, dosage, side effects, effectiveness, and treatment comparisons. Key findings included modafinil use across 166 health conditions, most frequently narcolepsy, multiple sclerosis, attention-deficit disorder, anxiety, sleep apnea, depression, bipolar disorder, chronic fatigue syndrome, fibromyalgia, and chronic disease. Word-embedding topic modeling mapped 70% of posts to predefined themes, while sentiment analysis revealed 65% positive responses, 6% neutral responses, and 28% negative responses.
Notably, the perceived effectiveness of modafinil for various conditions strongly contrasts with the findings of existing randomized controlled trials and systematic reviews, which conclude that evidence of effectiveness is insufficient or of low quality. This study demonstrated the value of combining NLP with linguistic techniques for analyzing large unstructured text datasets. Despite varying opinions, findings were methodologically consistent and challenged existing clinical evidence. This suggests that patient-generated data could provide valuable insights into treatment outcomes, potentially improving clinical understanding and patient care.
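
The corpus methods named in the abstract (n-gram generation, collocation, and concordance) can be illustrated with a minimal standard-library Python sketch. This is not the authors' pipeline, which used Sketch Engine for the linguistic analysis. The function names, the use of pointwise mutual information (PMI) as the collocation score, and the three example posts are all assumptions made for illustration; the posts are invented and are not study data.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, n):
    """Return all contiguous n-grams of one token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def collocations(docs, min_count=2):
    """Rank adjacent word pairs by PMI (an assumed scoring choice).

    Bigrams are counted within each document, so no pair spans
    the boundary between two posts.
    """
    unigrams = Counter(tok for doc in docs for tok in doc)
    bigrams = Counter(bg for doc in docs for bg in ngrams(doc, 2))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # skip rare pairs, whose PMI is unreliable
        p_pair = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def concordance(docs, keyword, window=2):
    """Return keyword-in-context tuples: (left context, keyword, right context)."""
    lines = []
    for doc in docs:
        for i, tok in enumerate(doc):
            if tok == keyword:
                left = " ".join(doc[max(0, i - window):i])
                right = " ".join(doc[i + 1:i + 1 + window])
                lines.append((left, tok, right))
    return lines


# Hypothetical posts, invented for illustration only (not study data).
posts = [
    "Modafinil helps my brain fog but the side effects worry me.",
    "The side effects were mild and the brain fog lifted.",
    "No side effects so far on a low dose.",
]
docs = [tokenize(post) for post in posts]

ranked = collocations(docs)
kwic = concordance(docs, "effects")

for pair, score in ranked:
    print(" ".join(pair), round(score, 2))
for left, key, right in kwic:
    print(f"{left:>12} [{key}] {right}")
```

On a real corpus, the `min_count` threshold would normally be set much higher, since PMI tends to overrate pairs of rare words.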

Journal of Medical Internet Research, vol. 26, 11 Dec 2024
