Abstract

Micro-blog posts cover a large number of topics and carry useful information alongside a great deal of junk, which makes micro-blog topics highly prone to drift. Topic drift has two main aspects: topics change over time, and noise accumulates as the number of posts grows. We propose a topic drift detection method based on the LDA model. The method uses the Gibbs sampling algorithm to obtain the probability distribution of micro-blog post words based on word correlation, identifies topic boundaries with a dynamic constant method, extracts topic words by computing lexical information entropy within the topic field, and detects topic drift by aligning topic word sequences under a discrete-time model. Our experiments on LDA-based topic drift detection indicate that the method is effective at detecting topic drift in micro-blog posts.
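
The last two steps of the pipeline (entropy-based word scoring and drift detection across discrete time windows) can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so everything here is an assumption made for illustration: the function names `word_entropy`, `top_words`, and `drift_scores` are hypothetical, topic words are approximated by the most frequent tokens in a window rather than by LDA output, and the sequence alignment step is replaced by a plain Jaccard overlap between the topic word sets of consecutive windows.

```python
import math
from collections import Counter


def word_entropy(word, windows):
    """Shannon entropy (in bits) of a word's occurrences across time windows.

    Low entropy means the word is concentrated in few windows;
    high entropy means it is spread evenly. (Illustrative definition only.)
    """
    counts = [tokens.count(word) for tokens in windows]
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def top_words(tokens, k):
    """Return the k most frequent tokens, breaking ties alphabetically.

    Stand-in for topic word extraction; the paper uses LDA and entropy.
    """
    counts = Counter(tokens)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:k]]


def drift_scores(windows, k=3):
    """Drift between consecutive windows as 1 - Jaccard overlap of topic words.

    A score of 0.0 means identical topic word sets, 1.0 means no shared words.
    Assumes every window contains at least one token.
    """
    scores = []
    for prev, curr in zip(windows, windows[1:]):
        a, b = set(top_words(prev, k)), set(top_words(curr, k))
        scores.append(1 - len(a & b) / len(a | b))
    return scores
```

Each element of `windows` is a list of tokens from the posts in one time slice. A drift score near 1.0 between two adjacent slices would flag a candidate topic change, though the threshold for declaring drift is left open here because the abstract does not state one.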
