Abstract

In this article, we characterize the infiltration of health misinformation on social media as a dynamic dissemination process, in addition to using content-based features. Using the 2016 Zika discussion on Twitter as the study system, we identified the 264 most influential tweets containing misinformation and 455 matched tweets containing real information. We developed an algorithm to infer the information dissemination network formed through retweeting for each tweet and extracted nine network metrics. We then approximated information dissemination as a nonhomogeneous Poisson process (NHPP) signal and extracted 40 signal features to characterize each NHPP. For content-based features, we applied Linguistic Inquiry and Word Count (LIWC) and document-to-vector (Doc2Vec) to extract a further 63 and 50 features for each tweet, respectively. Finally, we also considered four user features. Based on these feature categories, we trained support vector machine (SVM) and random forest (RF) classifiers. Using all feature categories combined as input, an RF classifier achieved > 83% accuracy and > 90% AUC in detecting misinformation.
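To make the NHPP step more concrete, the following is a minimal sketch of how retweet timestamps could be turned into a time-varying rate signal. It assumes a piecewise-constant intensity estimated from binned counts, which may differ from the estimator actually used in the study. The function names `binned_intensity` and `simple_signal_features`, the bin width, and the three summary features are hypothetical illustrations and are not the 40 signal features referred to above.

```python
import math
from typing import Dict, List


def binned_intensity(timestamps: List[float], bin_width: float,
                     horizon: float) -> List[float]:
    """Estimate a piecewise-constant NHPP intensity (events per unit time).

    Assumption: the intensity is constant within each bin, so the estimate
    for a bin is simply (number of events in the bin) / bin_width.
    """
    if bin_width <= 0 or horizon <= 0:
        raise ValueError("bin_width and horizon must be positive")
    n_bins = math.ceil(horizon / bin_width)
    counts = [0] * n_bins
    for t in timestamps:
        if 0 <= t < horizon:
            # Clamp the index to guard against floating-point edge cases.
            idx = min(int(t // bin_width), n_bins - 1)
            counts[idx] += 1
    return [c / bin_width for c in counts]


def simple_signal_features(rates: List[float],
                           bin_width: float) -> Dict[str, float]:
    """Compute a few illustrative (hypothetical) features of the rate signal."""
    peak = max(rates)
    return {
        "peak_rate": peak,
        # Start time of the first bin that reaches the peak rate.
        "time_to_peak": rates.index(peak) * bin_width,
        # Integral of the estimated intensity, i.e. the events counted.
        "total_events": sum(r * bin_width for r in rates),
    }


# Hypothetical retweet times, in hours after the original tweet.
retweet_times = [0.2, 0.5, 0.9, 1.1, 1.4, 3.7, 5.0]
rates = binned_intensity(retweet_times, bin_width=2.0, horizon=6.0)
features = simple_signal_features(rates, bin_width=2.0)
print(rates)     # [2.5, 0.5, 0.5]
print(features)
```

A binned estimate is only one of several ways to approximate an NHPP intensity; smoother alternatives such as kernel estimates would give a different signal from the same timestamps.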
