Abstract
Text data on social network platforms take the form of short texts, and at large scale these data are high-dimensional and sparse, so traditional clustering algorithms perform poorly on them. In this paper, a new community detection method based on the sparse subspace clustering (SSC) algorithm is proposed to deal with the sparsity and high dimensionality of short texts in online social networks. The main idea is as follows. First, structured data, including user attributes and user behavior, and unstructured data, such as user reviews, are used to construct the vector space for the network. The similarity of feature words is calculated from their relative positions in the synonym word forest. Then, the dimensionality of the data is reduced by principal component analysis in order to improve the clustering accuracy. Further, a new community detection method for social network members based on SSC is proposed. Finally, experiments on several data sets are performed and the results are compared with the K-means clustering algorithm. Experimental results show that proper dimension reduction for high-dimensional data can improve the clustering accuracy and efficiency of the SSC approach. The proposed method can achieve a suitable community partition on online social network data sets.
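The pipeline described in the abstract (dimension reduction by principal component analysis, sparse self-representation, then partitioning of the resulting affinity graph) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the penalty `lam`, the iteration counts, the plain soft-thresholding solver, the farthest-first k-means initialisation, and the synthetic demo data are all hypothetical choices made for this sketch, and the construction of the user vector space from structured data and reviews is not reproduced.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal components."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


def sparse_self_representation(X, lam=0.05, n_iter=300):
    """Approximate min_C 0.5*||Y - Y C||_F^2 + lam*||C||_1 with diag(C) = 0,
    where the columns of Y are the unit-normalised rows of X.
    Solver: plain iterative soft thresholding (an assumption of this sketch)."""
    norms = np.maximum(np.linalg.norm(X, axis=1, keepdims=True), 1e-12)
    Yt = X / norms                       # rows are unit-norm samples
    G = Yt @ Yt.T                        # Gram matrix Y^T Y
    step = 1.0 / np.linalg.norm(G, 2)    # 1 / largest eigenvalue of G
    C = np.zeros_like(G)
    for _ in range(n_iter):
        C = C - step * (G @ C - G)       # gradient step on the quadratic term
        C = np.sign(C) * np.maximum(np.abs(C) - step * lam, 0.0)  # soft threshold
        np.fill_diagonal(C, 0.0)         # a sample may not represent itself
    return C


def spectral_labels(W, n_clusters, n_iter=50):
    """Cluster the nodes of a symmetric affinity matrix W."""
    d = np.maximum(W.sum(axis=1), 1e-12)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L = np.eye(len(W)) - W * np.outer(d_inv_sqrt, d_inv_sqrt)  # normalised Laplacian
    _, vecs = np.linalg.eigh(L)          # eigenvalues in ascending order
    E = vecs[:, :n_clusters]
    E = E / np.maximum(np.linalg.norm(E, axis=1, keepdims=True), 1e-12)
    # Deterministic farthest-first initialisation, then Lloyd iterations.
    centers = [E[0]]
    for _ in range(1, n_clusters):
        dist = np.min([np.sum((E - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(E[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        sq = ((E[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = np.argmin(sq, axis=1)
        for k in range(n_clusters):
            if np.any(labels == k):
                centers[k] = E[labels == k].mean(axis=0)
    return labels


def ssc_communities(X, n_clusters, n_components):
    """PCA, then sparse self-representation, then spectral partitioning."""
    Z = pca_reduce(X, n_components)
    C = sparse_self_representation(Z)
    W = np.abs(C) + np.abs(C).T          # symmetric affinity graph
    return spectral_labels(W, n_clusters)


def demo_data(n_per_group=25, ambient_dim=10, seed=0):
    """Synthetic stand-in for user vectors: two groups of points, each lying
    in its own 2-D subspace of a higher-dimensional space."""
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.standard_normal((ambient_dim, 4)))
    angles = 2 * np.pi * np.arange(n_per_group) / n_per_group
    circle_a = np.column_stack([np.cos(angles), np.sin(angles)])
    circle_b = np.column_stack([np.cos(angles + 0.3), np.sin(angles + 0.3)])
    return np.vstack([circle_a @ Q[:, :2].T, circle_b @ Q[:, 2:].T])


X_demo = demo_data()
labels = ssc_communities(X_demo, n_clusters=2, n_components=4)
print(labels)
```

On this synthetic input the two planted groups occupy separate subspaces, so the affinity graph should split into two components and each group should receive its own label. Real user vectors are noisier, and `lam` and `n_components` would need tuning against the data.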
Highlights
Smartphones are becoming the main way for people to receive information
Based on the semantic features of the text, this paper proposes a new method for detecting consumer groups in online social e-commerce based on sparse subspace clustering (SSC), which may mitigate, to some extent, the curse of dimensionality and sparsity in short text clustering
This paper proposes a new SSC-based method for community discovery among social network members, which collects users' structured and unstructured data to construct a vector space model and establish the similarity of feature words
Summary
Smartphones are becoming the main way for people to receive information. Mobile applications are roughly divided into four categories: instant messaging, search engines, online news, and social applications [1]. Social applications are convenient and simple to operate and have become the main way for people to communicate remotely. Built on interpersonal network relationships, social media such as WeChat and Microblog strengthen user stickiness and play an important guiding role in how consumers obtain shopping information, which offers a good channel for e-commerce promotion and reduces business costs. With the growing relationship flow and information flow on the platform, the Microblog network has the advantages of a large user base, high user activity, and user interest orientation. The immediacy and openness of the information platform promote the gathering of users with common values and interests, so user communities form gradually [3]. Current community detection methods usually divide users based on their basic attributes (age, gender, educational level, place of birth, etc.), which does not reveal the characteristics of user groups based on product information [4].