Abstract

This paper develops techniques to mine medical and health-related social media content for information and opinions about diseases and treatments, so that users can benefit from the many accounts of personal experience scattered across the Web. It reports the results of a textual analysis of user-generated content on drug-related websites. The aim is to determine what kinds of content can be expected on these sites, from both linguistic and informational points of view. User postings were harvested from two websites carrying different kinds of user-generated content and compared with information on the same drugs from a hospital information portal. The corpus was analyzed to identify which kinds of drugs were often reviewed, the vocabulary and medical concepts used, and textual characteristics such as posting length, sentence length, and part-of-speech distribution. Although drug-related user-generated content differs markedly from editorial content, the results show that it provides useful information on many drugs. Linguistically, user-generated content is simpler in language and more informal in style than editorial content, but it still contains useful health information that tends to be patient-oriented.
