Summary

Online hot event discovery has become a flourishing research frontier, in which online document streams are monitored so that each document is either assigned to a previously detected event or used to discover a newly occurring one. However, hot events tend to evolve, and the words topically related to them are likely to evolve as well. This makes event discovery a challenging task for traditional mining approaches. The Association Link Network (ALN) combines word association and semantic community to organize loosely distributed associated resources. This paper presents a novel ALN-based approach to online hot event discovery. The approach proceeds in three stages. In the first stage, we extract significant features to represent the content of each document in the online document stream. In the second stage, we classify the online document stream into topically related detected events, modeling event evolution in the form of an ALN. In the third stage, we design an ALN-based event detection algorithm that discovers newly occurring hot events in a timely manner. The online datasets used in our empirical studies were acquired from Baidu News and span 1315 hot events and 236,300 documents. Experimental results demonstrate that the approach discovers hot events with high accuracy, good scalability, and short runtime. Copyright © 2014 John Wiley & Sons, Ltd.