Abstract

PurposeBased on user-generated content from a Chinese social media platform, this paper aims to investigate multiple methods of constructing user profiles and their effectiveness in predicting their gender, age and geographic location.Design/methodology/approachThis investigation collected 331,634 posts from 4,440 users of Sina Weibo. The data were divided into two parts, for training and testing . First, a vector space model and topic models were applied to construct user profiles. A classification model was then learned by a support vector machine according to the training data set. Finally, we used the classification model to predict users’ gender, age and geographic location in the testing data set.FindingsThe results revealed that in constructing user profiles, latent semantic analysis performed better on the task of predicting gender and age. By contrast, the method based on a traditional vector space model worked better in making predictions regarding the geographic location. In the process of applying a topic model to construct user profiles, the authors found that different prediction tasks should use different numbers of topics.Originality/valueThis study explores different user profile construction methods to predict Chinese social media network users’ gender, age and geographic location. The results of this paper will help to improve the quality of personal information gathered from social media platforms, and thereby improve personalized recommendation systems and personalized marketing.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.