Abstract

The online event discovery in social media based documents is useful, such as for disaster recognition and intervention. However, the diverse events incrementally identified from social media streams remain accumulated, ad hoc, and unstructured. They cannot assist users in digesting the tremendous amount of information and finding their interested events. Further, most of the existing work is challenged by jointly identifying incremental events and dynamically organizing them in an adaptive hierarchy. To address these problems, this article proposes d ynamic and h ierarchical C ategorization M odeling (dhCM) for social media stream. Instead of manually dividing the timeframe, a multimodal event miner exploits a density estimation technique to continuously capture the temporal influence between documents and incrementally identify online events in textual, temporal, and spatial spaces. At the same time, an adaptive categorization hierarchy is formed to automatically organize the documents into proper categories at multiple levels of granularities. In a nonparametric manner, dhCM accommodates the increasing complexity of data streams with automatically growing the categorization hierarchy over adaptive growth. A sequential Monte Carlo algorithm is used for the online inference of the dhCM parameters. Extensive experiments show that dhCM outperforms the state-of-the-art models in terms of term coherence, category abstraction and specialization, hierarchical affinity, and event categorization and discovery accuracy.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call