Abstract

Short text stream classification is challenging because short text streams are sparse, high-dimensional, and change rapidly. In this paper, we present a short text stream classification approach refined from the online Biterm Topic Model (BTM) that uses short text expansion and concept drift detection. Specifically, we first expand the short texts using an external resource to compensate for data sparsity, and use online BTM to select representative topics, rather than word vectors, as the features of the short texts. Second, we propose a concept drift detection method based on the topic model to detect hidden concept drifts in short text streams. Third, we build an ensemble model from several data chunks and update it using the newest data chunk and the results of concept drift detection. Finally, extensive experimental results demonstrate that, compared with well-known baselines, our approach achieves better performance in both classification and concept drift detection.
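
As a rough illustration of topic-based drift detection between consecutive data chunks, the sketch below compares the average topic distribution of two chunks. It is not the method of this paper: the abstract does not specify the distance measure or decision rule, so the use of Jensen-Shannon divergence, the fixed `threshold` value, and all function names here are assumptions made for illustration. The per-document topic distributions are assumed to come from a topic model such as online BTM.

```python
import math
from typing import List, Sequence


def mean_topic_distribution(doc_topics: Sequence[Sequence[float]]) -> List[float]:
    """Average the per-document topic distributions of one data chunk."""
    n_topics = len(doc_topics[0])
    totals = [0.0] * n_topics
    for dist in doc_topics:
        for k, p in enumerate(dist):
            totals[k] += p
    return [t / len(doc_topics) for t in totals]


def js_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """Jensen-Shannon divergence with base-2 logs, bounded in [0, 1]."""
    def kl(a: Sequence[float], b: Sequence[float]) -> float:
        # Terms with a zero probability in `a` contribute nothing.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def drift_detected(
    prev_chunk: Sequence[Sequence[float]],
    new_chunk: Sequence[Sequence[float]],
    threshold: float = 0.1,  # assumed value, not taken from the paper
) -> bool:
    """Flag a drift when the chunk-level topic distributions diverge."""
    prev_mean = mean_topic_distribution(prev_chunk)
    new_mean = mean_topic_distribution(new_chunk)
    return js_divergence(prev_mean, new_mean) > threshold


# Hypothetical topic distributions for two chunks of three short texts each.
old_chunk = [[0.8, 0.1, 0.1], [0.7, 0.2, 0.1], [0.9, 0.05, 0.05]]
new_chunk = [[0.1, 0.1, 0.8], [0.2, 0.1, 0.7], [0.05, 0.05, 0.9]]
print(drift_detected(old_chunk, old_chunk))
print(drift_detected(old_chunk, new_chunk))
```

In a chunk-based ensemble, a signal like this could decide whether the newest chunk's classifier replaces an outdated member or is simply added alongside the existing ones; the actual update rule is described in the paper body, not here.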
