Abstract

User-generated content (UGC) has become an important part of Internet user data. This study aims to develop an innovative user identification approach based on UGC platforms. To achieve this objective, the research proposes i) a web mining process to crawl UGC data; ii) a lead user identification index system for evaluating the innovation capability of users; and iii) a user classification process based on K-means clustering of users' UGC performance. Specifically, the complete user performance data of more than 100 users on Douban (one of the biggest UGC platforms in China) were collected, and web mining, factor analysis, and a clustering algorithm were integrated to process the data and classify users into groups according to their UGC performance. The classification results were verified against expert judgment, which showed that the classification can accurately recognize users who exhibit lead userness. This research is expected to help small and medium enterprises without strong big data capabilities to identify innovative users and valuable UGC data more efficiently, and to facilitate further product improvement.
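
The classification step can be illustrated with a minimal sketch. It is not the authors' implementation: the three indicators (posts, followers, average likes), the six users, and their values are all hypothetical, plain z-score standardization stands in for the factor analysis step, and the farthest-point initialization is chosen only to keep the example deterministic. Picking the cluster with the highest centroid as the lead user candidates is likewise an assumption; in the study, the groups were checked against expert judgment.

```python
import math


def zscore(rows):
    """Standardize each column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 when a column has zero variance.
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(points, k):
    """Deterministic seeding: start at the first point, then repeatedly
    add the point farthest from the centroids chosen so far."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(points, k, max_iters=100):
    """Lloyd's algorithm; returns (labels, centroids)."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iters):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids


# Hypothetical UGC indicators per user: [posts, followers, average likes].
users = {
    "u1": [120, 900, 45.0],
    "u2": [110, 850, 50.0],
    "u3": [130, 1000, 40.0],
    "u4": [5, 20, 1.0],
    "u5": [8, 35, 2.0],
    "u6": [3, 10, 0.5],
}

names = list(users)
features = zscore([users[n] for n in names])
labels, centroids = kmeans(features, k=2)

# Assumption: the cluster whose centroid scores highest overall holds
# the lead user candidates.
lead_cluster = max(range(len(centroids)), key=lambda j: sum(centroids[j]))
lead_users = [n for n, lab in zip(names, labels) if lab == lead_cluster]
print(lead_users)
```

With real data, the input rows would be the factor scores derived from the index system rather than raw counts, and the number of clusters would need to be chosen and validated rather than fixed at two.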
