Abstract
In recent text mining research, there is a trend toward analyzing the burst features of specific entities, such as a word, a meme, or a document, in text streams. Such burst features can be identified efficiently and robustly by Kleinberg's two-state automaton model. However, the model's two parameters, which are set manually, heavily affect its performance. In this paper, the function of the two parameters is examined, and two algorithms are proposed for estimating them. Experiments with public news corpora show that our estimation can maximize the reliability of the detection results and effectively remove noisy burst features.
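To make the role of the two manually set parameters concrete, the following is a minimal sketch of a two-state burst detector over batched counts, in the spirit of Kleinberg's automaton. It assumes the two parameters are the rate scaling `s` and the transition cost `gamma` from Kleinberg's formulation. The function name `detect_bursts`, the input layout, and the toy counts are hypothetical, and the sketch does not implement the estimation algorithms proposed in the paper.

```python
import math


def detect_bursts(relevant, totals, s=2.0, gamma=1.0):
    """Return one state per batch (1 = burst, 0 = base) with minimal total cost.

    relevant[t] : documents in batch t that mention the entity
    totals[t]   : all documents in batch t
    s           : assumed scaling of the burst rate over the base rate
    gamma       : assumed weight of the cost of entering the burst state
    """
    n = len(relevant)
    p0 = sum(relevant) / sum(totals)      # base rate of relevant documents
    if not 0.0 < p0 < 1.0:                # degenerate stream: nothing to detect
        return [0] * n
    p1 = min(s * p0, 0.9999)              # burst rate, kept a valid probability
    probs = (p0, p1)

    def fit_cost(state, r, d):
        # Negative binomial log-likelihood of r relevant out of d documents.
        # The binomial coefficient is the same for both states, so it is omitted.
        p = probs[state]
        return -(r * math.log(p) + (d - r) * math.log(1.0 - p))

    up_cost = gamma * math.log(n)         # cost of moving from state 0 to state 1

    # Viterbi-style dynamic programme; the automaton starts in the base state.
    cost = [0.0, float("inf")]
    back = []
    for r, d in zip(relevant, totals):
        new_cost, choices = [], []
        for state in (0, 1):
            # Moving up is charged up_cost; staying or moving down is free.
            candidates = [
                cost[prev] + (up_cost if state > prev else 0.0) for prev in (0, 1)
            ]
            prev = 0 if candidates[0] <= candidates[1] else 1
            new_cost.append(candidates[prev] + fit_cost(state, r, d))
            choices.append(prev)
        cost = new_cost
        back.append(choices)

    # Trace the cheapest path back from the last batch.
    state = 0 if cost[0] <= cost[1] else 1
    states = []
    for choices in reversed(back):
        states.append(state)
        state = choices[state]
    states.reverse()
    return states


# Toy stream: eight batches of 100 documents with a spike in batches 3 to 5.
relevant = [1, 1, 1, 10, 12, 11, 1, 1]
totals = [100] * 8
print(detect_bursts(relevant, totals, s=2.0, gamma=1.0))   # [0, 0, 0, 1, 1, 1, 0, 0]
print(detect_bursts(relevant, totals, s=2.0, gamma=10.0))  # [0, 0, 0, 0, 0, 0, 0, 0]
```

On this toy stream, raising `gamma` from 1 to 10 makes the same spike too expensive to enter, so no burst is reported. This illustrates why the choice of the two parameters has such a strong effect on which bursts are detected.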