Abstract

As social networks have grown in popularity, users increasingly interact with each other and comment on current events online, and extracting hot topics from this content has become a focus of natural language processing research. In this paper, we propose a hot topic extraction method based on the popularity of micro-blogs. First, we use cross-entropy to define the heat of a micro-blog according to its numbers of comments and forwards. We then combine the heat of the micro-blog with a word2vec model to assign a weight to each word, apply a bidirectional LSTM (Long Short-Term Memory) network to encode document semantics, and use the single-pass method for topic mining. We evaluate the proposed method with three indicators: NMI (Normalized Mutual Information), PMI (Point-wise Mutual Information), and Purity. Using crawlers, we collected over 10,000 micro-blogs covering 15 hot topics from Sina Weibo in 2017. The experimental results show that the proposed method performs better and is more robust than the traditional topic detection method.
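
The single-pass step mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each document is already encoded as a fixed-length vector (in the paper this would come from the bidirectional LSTM, which is not reproduced here), and the cosine similarity measure, the running-mean centroid update, and the `threshold` value are all assumptions chosen for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def single_pass(vectors, threshold=0.8):
    """Assign each vector to a topic cluster in one pass over the data.

    A vector joins the most similar existing cluster if the similarity to
    that cluster's centroid reaches `threshold` (a hypothetical value);
    otherwise it starts a new cluster. Returns one cluster label per vector.
    """
    centroids = []  # running mean vector of each cluster
    sizes = []      # number of members in each cluster
    labels = []
    for v in vectors:
        best, best_sim = -1, -1.0
        for i, c in enumerate(centroids):
            sim = cosine(v, c)
            if sim > best_sim:
                best, best_sim = i, sim
        if best >= 0 and best_sim >= threshold:
            # Update the centroid as the running mean of its members.
            n = sizes[best]
            centroids[best] = [(c * n + x) / (n + 1)
                               for c, x in zip(centroids[best], v)]
            sizes[best] = n + 1
            labels.append(best)
        else:
            centroids.append(list(v))
            sizes.append(1)
            labels.append(len(centroids) - 1)
    return labels


# Toy 2-D "document vectors": two near the x-axis, two near the y-axis.
docs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
print(single_pass(docs))
```

Because the method reads each document once and never revisits earlier assignments, the result depends on input order and on the threshold; both would need tuning on real micro-blog data.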
