Abstract

Topic detection aims to automatically identify events and hot topics in social networks and to continuously track known topics. Traditional methods such as Latent Dirichlet Allocation and Probabilistic Latent Semantic Analysis are difficult to apply because of the high dimensionality of massive event texts and the sparsity of short texts in social networks. The sparse distribution of topics also leads to unclear topics. To address these challenges, we propose a novel word embedding topic model, named the Cbow Topic Model (CTM), that combines a topic model with the continuous bag-of-words model (Cbow) word embedding method for topic detection and summary in social networks. We cluster similar words in the target social network text dataset using the classic Cbow word vectorization method, which can effectively learn the internal relationships between words and reduce the dimensionality of the input texts. We then use the topic model to model the short texts, which effectively weakens the sparsity problem of social network texts. To detect and summarize topics, we propose a topic detection method for social networks based on similarity computing. We collected a Sina microblog dataset to conduct various experiments. The experimental results demonstrate that the CTM method is superior to the existing topic model methods.

Highlights

  • In recent years, the rapid development of online social networks, such as Twitter, Facebook, and Sina Weibo, has greatly affected people’s social and working styles

  • The Cbow Topic Model (CTM) topic detection method proposed in this paper considers the high dimensionality of the massive texts in the social network and the sparse characteristics of short texts

  • We propose a novel word embedding topic model for topic detection and summary that combines a continuous bag-of-words model (Cbow) event vectorization model with an aggregated-document topic model


Summary

Introduction

The rapid development of online social networks, such as Twitter, Facebook, and Sina Weibo, has greatly affected people’s social and working styles. It is necessary to detect social topics and emergencies, and to identify all kinds of events, so as to purify the network environment and improve the social atmosphere. Such detection can help grasp public sentiment and public opinion, and provide a basis for government decision-making. Researchers have proposed a method based on a vector space model and a statistical language model to implement the event monitoring modeling process. Such approaches can overcome the sparsity problem in the social network context by modeling different features or attributes. However, these methods ignore the importance of relationships between words and require complex heuristic processing. To effectively detect and summarize topics, we propose a topic detection and summary method that applies similarity computing within the proposed topic model.
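The idea of clustering similar words to shrink the input vocabulary can be illustrated with a minimal sketch. This is not the CTM implementation: the hand-written `toy_embeddings` stand in for vectors that a trained Cbow model would supply, and the greedy clustering rule, the `threshold` value, and the helper names `cosine`, `cluster_words`, and `compress` are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cluster_words(embeddings, threshold=0.9):
    """Greedy clustering (an assumed rule, not the paper's): each word joins
    the first cluster whose seed vector is at least `threshold` similar,
    otherwise it starts a new cluster."""
    seeds = []        # list of (cluster_id, seed_vector)
    assignment = {}   # word -> cluster_id
    for word, vec in embeddings.items():
        for cid, seed in seeds:
            if cosine(vec, seed) >= threshold:
                assignment[word] = cid
                break
        else:
            cid = len(seeds)
            seeds.append((cid, vec))
            assignment[word] = cid
    return assignment


def compress(tokens, assignment):
    """Replace each known token with its cluster id, so similar words
    share one dimension in the downstream topic model input."""
    return [assignment[t] for t in tokens if t in assignment]


# Hypothetical 3-dimensional vectors; real Cbow vectors are learned from text.
toy_embeddings = {
    "earthquake": [0.9, 0.1, 0.0],
    "quake": [0.88, 0.12, 0.02],
    "concert": [0.0, 0.2, 0.95],
    "gig": [0.05, 0.18, 0.9],
}

clusters = cluster_words(toy_embeddings)
compressed = compress(["quake", "earthquake", "gig"], clusters)
```

With these toy vectors, the four words collapse into two clusters, so a short post is represented over two dimensions instead of four. The same cosine function could serve for comparing a post with a topic representation, although how CTM performs that comparison is not specified in this summary.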
