Abstract

The investigation of user preferences through user comments has attracted significant attention. Although topic models are established tools for understanding textual content, they cannot be applied directly to this task because of two problems. The first is severe data sparsity: user comments are generally short. The second is that opinions from user comments are mixed with those of the original documents the users commented on. To solve the data sparsity problem and recover clean user preferences simultaneously, we propose an author co-occurring topic model (AOTM) for normal documents and their short user comments. By considering authorship, AOTM allows each author of short texts to have a probability distribution over a set of topics represented only by short texts. Accordingly, individual user preferences can be investigated based on these author-level distributions. We verify the performance of AOTM using two news article datasets and one e-commerce dataset. Extensive experiments demonstrate that AOTM outperforms several state-of-the-art methods in topic learning and topic representation of documents. The potential usage of AOTM in exploring individual user preferences is further illustrated by drawing user portraits and predicting user posting behaviors.
