Abstract

Microblog is a new network medium that has grown rapidly. Detecting and tracking hot topics in Microblog has attracted wide attention from scholars at home and abroad in recent years. Algorithms designed to find topics in long texts, such as those on traditional news websites and blogs, cannot effectively handle Microblog data, which is sparse. This paper proposes a method for identifying hot topics in Microblog based on topic words. After preprocessing the Microblog data and dividing it into time windows, the method extracts topic words in each time window according to two factors: the rate of increase of word frequency and the relative word frequency. It then clusters the topic words according to the similarity among them and selects suitable clusters of topic words to describe the hot topics, thereby detecting hot topics in Microblog. Experimental verification shows that this method improves detection efficiency to a certain extent and raises both recall and precision, so that hot topics in Microblog can be found effectively and promptly.
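
The pipeline described above (score words per time window, then group similar topic words) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the formulas from the paper: the function names `topic_words` and `cluster_words`, the add-one smoothing, the product used to combine the two factors, and the Jaccard-based greedy clustering are all hypothetical choices made for the example. Posts are assumed to be already preprocessed into lists of tokens.

```python
from collections import Counter


def topic_words(prev_posts, cur_posts, top_k=3, min_count=2):
    """Rank words in the current window by growth rate times relative frequency.

    The scoring formula is an illustrative assumption, not the paper's.
    """
    prev = Counter(w for post in prev_posts for w in post)
    cur = Counter(w for post in cur_posts for w in post)
    total = sum(cur.values())
    scores = {}
    for word, count in cur.items():
        if count < min_count:
            continue  # skip words too rare in this window to signal a topic
        relative_freq = count / total
        # Add-one smoothing keeps the rate finite for words unseen before.
        growth_rate = (count - prev[word]) / (prev[word] + 1)
        scores[word] = growth_rate * relative_freq
    # Highest score first; ties are broken alphabetically.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top_k]


def cluster_words(words, posts, threshold=0.5):
    """Greedily group words whose sets of containing posts overlap (Jaccard)."""
    occurs = {w: {i for i, p in enumerate(posts) if w in p} for w in words}
    clusters = []
    for w in words:
        for cluster in clusters:
            rep = cluster[0]  # compare against the first word of the cluster
            union = occurs[w] | occurs[rep]
            if union and len(occurs[w] & occurs[rep]) / len(union) >= threshold:
                cluster.append(w)
                break
        else:
            clusters.append([w])
    return clusters


# Toy data: two consecutive time windows of tokenized posts.
prev_window = [["weather", "nice", "today"], ["lunch", "today"]]
cur_window = [
    ["earthquake", "city", "today"],
    ["earthquake", "city", "help"],
    ["earthquake", "rescue"],
    ["lunch", "today"],
]

hot = topic_words(prev_window, cur_window, top_k=2)
print(hot)                             # ['earthquake', 'city']
print(cluster_words(hot, cur_window))  # [['earthquake', 'city']]
```

In this toy example, "today" is frequent in the current window but did not grow relative to the previous one, so it scores zero, while "earthquake" and "city" rise sharply and co-occur often enough to fall into one cluster.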
