Abstract

Social networks have become an important part of human life. Several recent studies have used Latent Dirichlet Allocation (LDA) to analyze text corpora extracted from social platforms and discover underlying patterns in user data. However, to discover the major contents of a social network (e.g., Facebook) on a large scale, the available approaches need to collect and process the published data of every person on the network. This violates privacy rights and is time- and resource-consuming. This paper tackles the problem by focusing on fan pages, a special class of Facebook accounts that have much more impact than those of regular individuals. We propose a vector representation for Facebook fan pages that combines the LDA-based topic distributions of their posts with the interaction indices of those posts. The interaction index of each post is computed from its number of reactions and comments, and serves as the weight of that post when forming the topic distribution of a fan page. In experiments on a collection of Vietnamese Facebook fan pages, the proposed representation proves effective for fan page topic mining and clustering. Including the interaction indices of the posts improves fan page clustering performance by 9.0% in Silhouette score at the optimal number of clusters when using the K-means clustering algorithm. These results will help us build a system that can track trending content on Facebook without acquiring individual users' data.
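The weighting idea can be illustrated with a minimal sketch. The abstract does not give the formula for the interaction index, so the plain sum of reactions and comments used here is an assumption, as are the function names `interaction_index` and `page_vector` and the toy two-topic distributions; the per-post topic distributions are taken as already produced by an LDA model.

```python
from typing import List, Sequence


def interaction_index(reactions: int, comments: int) -> float:
    # Assumption: a plain sum. The abstract only says the index is
    # "computed based on" reactions and comments, not how.
    return float(reactions + comments)


def page_vector(post_topics: Sequence[Sequence[float]],
                weights: Sequence[float]) -> List[float]:
    """Interaction-weighted average of per-post topic distributions."""
    if not post_topics or len(post_topics) != len(weights):
        raise ValueError("need one weight per post and at least one post")
    total = sum(weights)
    if total <= 0:
        # No interactions at all: fall back to an unweighted average.
        weights = [1.0] * len(post_topics)
        total = float(len(post_topics))
    vec = [0.0] * len(post_topics[0])
    for dist, w in zip(post_topics, weights):
        for k, p in enumerate(dist):
            vec[k] += w * p / total
    return vec


# Hypothetical fan page with two posts and two LDA topics.
posts = [
    {"topics": [0.9, 0.1], "reactions": 90, "comments": 10},
    {"topics": [0.2, 0.8], "reactions": 5, "comments": 0},
]
weights = [interaction_index(p["reactions"], p["comments"]) for p in posts]
vector = page_vector([p["topics"] for p in posts], weights)
```

In this toy case the heavily engaged first post dominates the page vector, whereas an unweighted mean would give both posts equal influence. Vectors of this kind could then be passed to a clustering algorithm such as K-means.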
