Abstract

Online patient discussions in health-related social media forums are a rich and increasingly important source of insight into the patient perspective on all aspects of a medical condition. Yet there is a common perception in the pharmaceutical industry that using such data will generate an overwhelming number of reportable adverse events (AEs). This perception can discourage the use of health-related social media and thereby limit understanding of the patient perspective. In this study, we set out to determine the frequency of reportable AEs in a large sample of patient posts. We applied a combination of regular expressions and machine learning techniques to a collection of 10,000 posts obtained from cancer discussion forums to detect posts that met all four of the criteria required for reporting AEs. We first quantified the performance of the machine learning algorithm on a set of simulated test posts, generated by inserting random combinations of postal addresses, email addresses, zip codes and telephone numbers together with AEs and treatment names. Potentially reportable posts were then manually reviewed. Under testing conditions with simulated posts, the machine learning algorithm identified reportable posts with an AUC of 0.928 and an overall accuracy of 88%. On the collection of 10,000 real posts from cancer forums, the algorithm identified 505 potentially reportable posts for manual review. After manual review, only two posts met all four criteria for reporting AEs. Whilst there is a concern that studies using health-related social media discussions will detect a large number of AEs, this study found that posts meeting all four criteria for reporting are very rare: only two of the 10,000 posts (0.02%) met the criteria.
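
The regular-expression stage described above can be illustrated with a minimal sketch. It flags a post as a candidate for manual review when it contains some contact detail (email address, telephone number or zip code) together with a treatment name and an AE term, mirroring the ingredients of the simulated posts. The patterns, the two small vocabularies and the function names are assumptions made for illustration only; the abstract does not give the expressions, term lists or classifier actually used, and this sketch does not include the machine learning step.

```python
import re

# Illustrative patterns only (US-style formats assumed); the study's own
# regular expressions are not given in the abstract.
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
PHONE_RE = re.compile(r"(?<!\d)(?:\(\d{3}\)\s*|\d{3}[-.\s])\d{3}[-.\s]\d{4}(?!\d)")
ZIP_RE = re.compile(r"(?<![\d-])\d{5}(?:-\d{4})?(?![\d-])")

# Hypothetical vocabularies; a real screen would use much larger lexicons.
TREATMENTS = {"tamoxifen", "cisplatin"}
ADVERSE_EVENTS = {"nausea", "neuropathy", "fatigue"}


def contact_details(text):
    """Return every email address, phone number and zip code found in text."""
    return {
        "email": EMAIL_RE.findall(text),
        "phone": PHONE_RE.findall(text),
        "zip": ZIP_RE.findall(text),
    }


def is_candidate(text):
    """True if the post has a contact detail, a treatment name and an AE term."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    has_contact = any(contact_details(text).values())
    return has_contact and bool(words & TREATMENTS) and bool(words & ADVERSE_EVENTS)


if __name__ == "__main__":
    posts = [
        "I had terrible nausea on tamoxifen. Email me at jane.doe@example.com",
        "Tamoxifen gave me nausea for weeks.",
    ]
    for post in posts:
        print(is_candidate(post), "|", post)
```

A screen of this kind is deliberately permissive: it is meant to narrow a large collection down to a set small enough for manual review, not to decide reportability on its own.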
