Abstract
Analyzing user behavior in online spaces is an important task. This paper analyzes online communities in terms of topics. We present a user–topic model based on latent Dirichlet allocation (LDA), as an application of topic modeling to a domain other than textual data. The model replaces the concept of word occurrence in the original LDA method with user participation. The proposed method addresses several problems in topic modeling and user analysis, including the inclusion of dynamic topics, visualization of user interaction networks, and event detection. We collected datasets from four online communities with different characteristics and conducted experiments that demonstrate the effectiveness of our method by revealing interesting findings covering numerous aspects.
Highlights
We present a preprocessing method that should be considered when exploiting online community datasets; it enables the proposed method to capture both the temporal and thematic features of user behavior effectively
We performed topic modeling with the number of topics ranging from 2 to 64
As user–topic modeling is based on the assumption that users with similar latent topics are likely to be engaged in the same article, the proposed method can be used for clustering purposes
Summary
The online community is an important virtual space where information spreads and users express their opinions and emotions. Instead of developing a complex probabilistic model for analyzing user behavior, we adopted a simple form of topic modeling to ensure flexibility, while making one important and effective substitution. Although the concept itself was introduced in our previous work [18], this work provides extensive experimental results, focusing especially on demonstrating the capability to analyze user behavior in web communities from many perspectives that the previous work did not cover. The proposed method is simple because it uses standard LDA with the small substitution of user participation for word occurrence. It is flexible because it does not use too many features, yet it ensures sufficient functionality in many applications of user behavior analysis.
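The substitution of user participation for word occurrence can be illustrated with a minimal sketch. Each article is treated as a "document" whose "words" are the IDs of the users who participated in it, and standard LDA is then fitted to those documents. The participation log, the function names `build_documents` and `gibbs_lda`, and the hyperparameter values are all hypothetical. The summary does not say which inference algorithm the authors used, so collapsed Gibbs sampling is assumed here only because it is compact.

```python
import random
from collections import defaultdict

# Hypothetical participation log: one (article_id, user_id) pair per post or comment.
participation = [
    ("a1", "alice"), ("a1", "bob"), ("a1", "alice"),
    ("a2", "bob"), ("a2", "alice"),
    ("a3", "carol"), ("a3", "dave"), ("a3", "carol"),
    ("a4", "dave"), ("a4", "carol"),
]

def build_documents(log):
    """Group the log by article: each article is a 'document' of user ids."""
    docs = defaultdict(list)
    for article, user in log:
        docs[article].append(user)
    return dict(docs)

def gibbs_lda(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for standard LDA, with users in place of words.

    alpha, beta, and iters are illustrative defaults, not values from the paper.
    """
    rng = random.Random(seed)
    users = sorted({u for tokens in docs.values() for u in tokens})
    index = {u: i for i, u in enumerate(users)}
    V = len(users)
    doc_topic = {d: [0] * num_topics for d in docs}    # events per (article, topic)
    topic_user = [[0] * V for _ in range(num_topics)]  # events per (topic, user)
    topic_total = [0] * num_topics                     # events per topic
    assign = {d: [] for d in docs}

    # Give every participation event a random initial topic.
    for d, tokens in docs.items():
        for u in tokens:
            k = rng.randrange(num_topics)
            assign[d].append(k)
            doc_topic[d][k] += 1
            topic_user[k][index[u]] += 1
            topic_total[k] += 1

    for _ in range(iters):
        for d, tokens in docs.items():
            for i, u in enumerate(tokens):
                j = index[u]
                k = assign[d][i]
                # Remove the current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_user[k][j] -= 1
                topic_total[k] -= 1
                # Resample the topic from the collapsed conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_user[t][j] + beta) / (topic_total[t] + V * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_user[k][j] += 1
                topic_total[k] += 1

    # Smoothed estimates: article-topic mixtures and topic-user distributions.
    theta = {
        d: [(c + alpha) / (len(docs[d]) + num_topics * alpha) for c in counts]
        for d, counts in doc_topic.items()
    }
    phi = [
        {u: (row[index[u]] + beta) / (topic_total[k] + V * beta) for u in users}
        for k, row in enumerate(topic_user)
    ]
    return theta, phi

docs = build_documents(participation)
theta, phi = gibbs_lda(docs, num_topics=2)
```

Here `theta[article]` is the estimated topic mixture of an article and `phi[k][user]` is the probability of a user under topic `k`. Nothing beyond the input representation differs from text-based LDA, which is why, in practice, any off-the-shelf LDA implementation could be given the same article-by-user counts.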