Abstract

To spare library staff from large volumes of redundant information and to let them track the needs and current interests of teachers and students in real time, this paper proposes a hot topic detection method for the WeChat library based on the Latent Dirichlet Allocation (LDA) model. The method first merges feature words by constructing a professional dictionary for the library domain, then represents the WeChat texts with the LDA model, and finally computes the similarity between texts using topic similarity and applies the Single-Pass clustering algorithm to identify hot topics. Experimental results show that the method effectively detects topics in WeChat library data and performs well on precision, recall, and F1.
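The final clustering step can be illustrated with a minimal sketch. It assumes each text has already been reduced to a topic distribution by an LDA model, and it uses cosine similarity as a stand-in for the topic similarity, since the abstract does not name a specific measure. The function names, the toy vectors, and the threshold of 0.8 are all hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two topic distributions (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def single_pass(topic_vectors, threshold=0.8):
    """Single-Pass clustering: each text is seen once, in arrival order.

    A text joins the most similar existing cluster if the similarity
    reaches the threshold; otherwise it starts a new cluster.
    """
    clusters = []  # each cluster: {"centroid": [...], "members": [indices]}
    for idx, vec in enumerate(topic_vectors):
        best, best_sim = None, -1.0
        for cluster in clusters:
            sim = cosine(vec, cluster["centroid"])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            best["members"].append(idx)
            n = len(best["members"])
            # Update the centroid as the running mean of its members.
            best["centroid"] = [
                (c * (n - 1) + x) / n for c, x in zip(best["centroid"], vec)
            ]
        else:
            clusters.append({"centroid": list(vec), "members": [idx]})
    return clusters


# Toy topic distributions over three topics (hypothetical data).
docs = [
    [0.90, 0.05, 0.05],
    [0.85, 0.10, 0.05],
    [0.05, 0.90, 0.05],
    [0.10, 0.80, 0.10],
    [0.05, 0.05, 0.90],
]
clusters = single_pass(docs, threshold=0.8)

# Larger clusters are treated here as candidate hot topics.
hot = sorted(clusters, key=lambda c: len(c["members"]), reverse=True)
for cluster in hot:
    print(cluster["members"])
```

Because Single-Pass processes texts in order and never revisits earlier assignments, the result depends on both the arrival order and the threshold, so in practice the threshold would need to be tuned on labeled data.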
