Abstract

Topic modelling methods such as Latent Dirichlet Allocation (LDA) have been successfully applied in various fields, since they can effectively characterize document collections as mixtures of semantically rich topics. Many such models have been proposed. However, existing models typically perform well when analysing a whole collection to find all of its topics, but have difficulty capturing coherent and specifically meaningful topic representations. Furthermore, it is very challenging to incorporate user preferences into existing topic modelling methods in order to extract relevant topics. To address these problems, we develop a novel personalized Association-based Topic Selection (ATS) model, which identifies semantically valid and relevant topics from a set of raw topics based on the semantic relatedness between users’ preferences and the structured patterns captured in topics. The advantage of the proposed ATS model is that it enables an interactive topic modelling process driven by users’ specific interests. Rigorous experiments on three benchmark datasets, namely RCV1, R8, and WT10G, in the context of information filtering (IF) and information retrieval (IR), show that the proposed ATS model can effectively identify topics relevant to users’ specific interests, and hence improve the performance of IF and IR.

