Abstract

So-called Big Social Data, a generic term for the massive amounts of data automatically collected from sources such as search queries and text generated on social media, have emerged as an important theme in the computational sciences, with applications such as predicting stock price movements, disease trends, and rates of adverse drug effects. Yet studies that directly compare results from such analyses with those derived from formal studies, such as surveillance and clinical trials, are rare. Some studies suggest that results from the analysis of Big Social Data may not entirely agree with observed results or with results from structured surveys. For example, the rate of influenza predicted from Google-based search data was found to overestimate the actual rate. However, the number of such comparative studies is still very limited. In this paper, we compare findings derived from data collected from an online medical forum with those derived from a formal clinical study. Specifically, we examine the co-occurrence of symptom keywords in posts by participants on an online breast cancer forum and directly compare the resulting symptom cluster patterns with data obtained from a clinical study of breast cancer survivors. The clinical study used data from a highly structured symptom checklist collected from N = 653 breast cancer survivors. Our findings suggest that the symptom clusters obtained from the two studies overlap substantially, but inconsistencies remain, especially for context-sensitive symptom items. In summary, the study demonstrates the potential of mining unstructured, text-based online forum data to supplement and validate structured quantitative data collected from clinical studies.
