Abstract

We propose a method to simplify textual Twitter data into understandable networks of terms that can signify important events and their possible changes over time. The method allows networks to share common characteristics across time periods, and each period can comprise multiple unknown sub-networks. The networks are described by Gaussian graphical models, and their parameter values are estimated through a Bayesian approach with a fused lasso-type prior on the precision matrices of the underlying mixtures of the sub-models. A flexible data allocation scheme is at the heart of an MCMC algorithm that recovers the mean and covariance parameters of the mixture components. Several implementations of the outlined estimation procedure are studied and compared on simulated data. The procedure with the highest predictive power is used for mining tweets regarding the 2009 Iranian presidential election.
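
As an illustration of the ingredients named in the abstract, the minimal Python sketch below shows how the precision matrix of a Gaussian graphical model maps to a network of terms, and how a fused lasso-type penalty scores sparsity within each period together with changes between consecutive periods. It is a sketch under stated assumptions, not the estimation procedure of the paper: the function names, the tuning parameters lam1 and lam2, and the example terms are hypothetical, and the penalty shown is the usual optimization counterpart of a fused lasso-type prior rather than the prior itself.

```python
import numpy as np


def partial_correlations(precision):
    """Partial correlations implied by a precision (inverse covariance) matrix."""
    d = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor


def edges(precision, terms, tol=1e-8):
    """Term pairs with a nonzero precision entry, i.e. conditionally dependent terms."""
    p = precision.shape[0]
    return [
        (terms[i], terms[j])
        for i in range(p)
        for j in range(i + 1, p)
        if abs(precision[i, j]) > tol
    ]


def fused_lasso_penalty(precisions, lam1, lam2):
    """L1 penalty on off-diagonal entries plus L1 penalty on period-to-period changes.

    lam1 and lam2 are hypothetical tuning parameters (assumption, not from the paper).
    """
    sparsity = sum(np.abs(P - np.diag(np.diag(P))).sum() for P in precisions)
    fusion = sum(np.abs(b - a).sum() for a, b in zip(precisions, precisions[1:]))
    return lam1 * sparsity + lam2 * fusion


# Hypothetical terms and precision matrices for two consecutive time periods.
terms = ["election", "vote", "protest"]
period_1 = np.array([[2.0, -1.0, 0.0], [-1.0, 2.0, 0.0], [0.0, 0.0, 1.0]])
period_2 = np.array([[2.0, -1.0, 0.0], [-1.0, 2.0, -0.5], [0.0, -0.5, 1.0]])

network_1 = edges(period_1, terms)
network_2 = edges(period_2, terms)
penalty = fused_lasso_penalty([period_1, period_2], lam1=1.0, lam2=2.0)
```

In this toy setting the second period gains one link relative to the first, and the fusion term of the penalty charges only for that change, which is the sense in which such a penalty encourages networks to share structure over time.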

Highlights

  • Twitter is a prominent social media tool that provides a rich resource of information

  • As an example, studied later in detail, more than one million tweets related to the social upheaval surrounding the 2009 Iranian presidential election may be compressed into an accessible visual summary

  • The performance of the proposed methods was assessed via a simulation analysis with three objectives: i) evaluating how well the two approaches, Bayesian stage-wise (BS) and Bayesian fused (BF), recover the graphical networks corresponding to multiple data sets; ii) comparing the Bayesian stage-wise mixture model (BSM) with the Bayesian fused mixture model (BFM) in the proposed mixture context; and iii) assessing the accuracy of the cluster assignment scheme DICRP relative to that of the original Chinese restaurant process (CRP) in the present context
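
For readers unfamiliar with the baseline in objective iii), the following minimal Python sketch draws cluster assignments from the original Chinese restaurant process. It covers only the standard CRP; the DICRP scheme is not described in this summary and is not implemented here. The function name sample_crp and the concentration parameter alpha are naming assumptions made for this illustration.

```python
import numpy as np


def sample_crp(n, alpha, rng):
    """Draw cluster labels for n observations from a Chinese restaurant process.

    Observation i joins an existing cluster k with probability
    counts[k] / (i + alpha) and opens a new cluster with probability
    alpha / (i + alpha), where i is the number of observations seated so far.
    Requires alpha > 0.
    """
    assignments = []
    counts = []
    for i in range(n):
        probs = np.array(counts + [alpha], dtype=float) / (i + alpha)
        k = int(rng.choice(len(probs), p=probs))
        if k == len(counts):
            counts.append(1)  # open a new cluster
        else:
            counts[k] += 1  # join an existing cluster
        assignments.append(k)
    return assignments


rng = np.random.default_rng(0)
labels = sample_crp(50, alpha=1.0, rng=rng)
```

Larger values of alpha make new clusters more likely, so the number of clusters does not need to be fixed in advance; this is what makes CRP-type schemes suitable when the number of sub-networks is unknown.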


Introduction

Twitter is a prominent social media tool that provides a rich resource of information. As an example, studied later in detail, more than one million tweets related to the social upheaval surrounding the 2009 Iranian presidential election may be compressed into an accessible visual summary. Such a summary can capture the different topics that are prominent in a given period and that evolve over time. This can be viewed as a form of network reconstruction in which collections of linked words, concepts or terms represent the highlighted topics at a given time stamp. Changes in these topics over time, whether a topic becomes outdated, expands or newly emerges, can then be explained by the evolution of the links between the words.
