Abstract

Short text stream classification is challenging because short text streams are sparse, high-dimensional, and change rapidly. In this paper, we present a short text stream classification approach that builds on the online Biterm Topic Model (BTM), using short text expansion and concept drift detection. Specifically, we first expand the short texts with an external resource to compensate for data sparsity, and use online BTM to select representative topics, rather than word vectors, as the features of the short texts. Second, we propose a topic-model-based concept drift detection method to detect hidden concept drifts in short text streams. Third, we build an ensemble model from several data chunks and update it using the newest data chunk and the results of concept drift detection. Finally, extensive experiments demonstrate that, compared with well-known baselines, our approach achieves better performance in both classification and concept drift detection.
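
As a rough illustration of the pipeline outlined in the abstract, the following sketch shows how biterms can be extracted from a short text, how a drift signal might be computed from chunk-level topic distributions, and how a chunk-based ensemble could be updated. The abstract does not specify the distance measure, the threshold, or the ensemble update policy, so the Jensen-Shannon divergence, the `threshold` value, the `max_size` limit, the reset-on-drift rule, and all function names here are assumptions chosen for illustration, not the authors' method.

```python
import math
from collections import deque
from itertools import combinations


def biterms(tokens):
    """Return every unordered word pair (biterm) in one short text."""
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


def chunk_topic_distribution(doc_topic_rows):
    """Average the per-document topic proportions over one data chunk."""
    n = len(doc_topic_rows)
    k = len(doc_topic_rows[0])
    return [sum(row[j] for row in doc_topic_rows) / n for j in range(k)]


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute nothing to the sum.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def drift_detected(prev_chunk, new_chunk, threshold=0.1):
    """Flag a drift when the chunk-level topic distributions diverge.

    The threshold is an assumed, illustrative value.
    """
    p = chunk_topic_distribution(prev_chunk)
    q = chunk_topic_distribution(new_chunk)
    return js_divergence(p, q) > threshold


def update_ensemble(ensemble, new_model, drift, max_size=5):
    """Add the newest chunk's model; on drift, drop the older members.

    Assumed policy: reset on drift, otherwise keep the latest max_size models.
    """
    if drift:
        ensemble.clear()
    ensemble.append(new_model)
    while len(ensemble) > max_size:
        ensemble.popleft()
    return ensemble
```

Here each row passed to `drift_detected` stands for the topic proportions that a topic model such as online BTM would assign to one document; the classifiers stored in the ensemble are left abstract.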
