Abstract

Abstract The development of Information and Communication Technologies (ICT) and Web 2.0 promotes the emergence of diverse social ecosystems like social Internet of Things (IoT), social media and online communities. User-generated textual content (UGTC), which consists of unstructured texts, is the most important and common type of user-generated content in social ecosystems. UGTC in social ecosystems is generated according to two types of context information—global context (topics) and local context (semantic regularities). For UGTC modeling, topic models just consider global context but ignore semantic regularities, while semantic embedding models are on the opposite. So only utilizing topic models or semantic embedding models to model UGTC suffers from some drawbacks. For this problem, we propose a semantic embedding enhanced topic model named SEE-Twitter-LDA for accurately modeling UGTC in social ecosystems. The core of SEE-Twitter-LDA is that words are generated according to mutual semantic information of topics and semantic regularities. So global context and local context are jointly considered for UGTC modeling. By utilizing 553 098 tweets sampled from Twitter and 211 233 posts sampled from Weibo, we validate SEE-Twitter-LDA’s better performance on perplexity, topic divergence and topic coherence versus existing related models.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call