Forum latent Dirichlet allocation for user interest discovery

Chaotao Chen, Jiangtao Ren

doi:10.1016/j.knosys.2017.04.006

Abstract

The popularity of online forums provides a good opportunity to learn user interests, which can be used in many business scenarios such as product or news recommendation. Many approaches exist for inferring forum topics and users’ interests; among them, Author-Topic (AT)-like models are the most popular. However, a thread in an online forum is composed of a root post and some response posts, which may be relevant or irrelevant to the root post, so the AT assumption that response posts are generated from the user’s interest topics is not comprehensive. In this paper, we distinguish between a user’s serious and unserious interest topics and argue that the topic of a relevant response post is jointly determined by its author’s serious interest topics and the topics of its root post, while the topic of an irrelevant response post is determined only by its author’s unserious interest topics. Based on these assumptions, we propose Forum-LDA to jointly model the generative process of root posts, relevant response posts, and irrelevant response posts. As a result, our model can not only learn more coherent topics and serious interests, but also identify unserious users who publish many irrelevant posts. Extensive experiments on a real forum dataset demonstrate the advantages of our model in tasks such as user interest discovery and unserious user discovery.
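The topic assignment for response posts described in the abstract can be illustrated with a minimal sketch. It is not the model's actual specification: the abstract does not say how the "joint" determination is defined, so the normalized elementwise product used here is an assumption, as are the function names, the `p_relevant` parameter, and the representation of topic distributions as plain probability lists.

```python
import random


def normalize(weights):
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w / total for w in weights]


def sample_index(probs, rng):
    """Draw one index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def sample_response_topic(root_topics, serious, unserious, p_relevant, rng):
    """Return (is_relevant, topic) for one hypothetical response post.

    root_topics: topic distribution of the thread's root post
    serious:     the author's serious-interest topic distribution
    unserious:   the author's unserious-interest topic distribution
    p_relevant:  assumed probability that the response is relevant
    """
    is_relevant = rng.random() < p_relevant
    if is_relevant:
        # Assumption: "jointly determined" is modeled as a normalized
        # elementwise product of the root-post and serious-interest
        # distributions. The paper may define this differently.
        joint = normalize([r * s for r, s in zip(root_topics, serious)])
        return True, sample_index(joint, rng)
    # Irrelevant responses depend only on the unserious interests.
    return False, sample_index(unserious, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    root = [0.7, 0.2, 0.1]
    serious = [0.6, 0.3, 0.1]
    unserious = [0.1, 0.1, 0.8]
    for _ in range(5):
        print(sample_response_topic(root, serious, unserious, 0.7, rng))
```

Under this assumption, a relevant response can only take a topic that both the root post and the author's serious interests support, while an irrelevant response ignores the root post entirely.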

Full Text