Resolving the Gray Sheep Problem in Collaborative Filtering Using Social Network Analysis Techniques

Minsung Kim, Il Im

doi:10.13088/jiis.2014.20.2.137

Abstract

Recommender systems have become one of the most important technologies in e-commerce. For many consumers, the ultimate reason to shop online is to reduce the effort of searching for information and making purchases, and recommender systems are a key technology for serving this need. Many past studies on recommender systems have been devoted to developing and improving recommendation algorithms, and collaborative filtering (CF) is known to be the most successful of them. Despite its success, CF has several shortcomings, including the cold-start, sparsity, and gray sheep problems. To generate recommendations, ordinary CF algorithms require evaluations or preference information directly from users. For new users who have no evaluations or preference information, CF therefore cannot produce recommendations (cold-start problem). As the numbers of products and customers increase, the user-item data grows rapidly and most of its cells are empty. Such a sparse dataset makes computing recommendations extremely hard (sparsity problem). Because CF assumes that there are groups of users sharing common preferences or tastes, it becomes inaccurate when there are many users with rare and unique tastes (gray sheep problem).

This study proposes a new algorithm that uses Social Network Analysis (SNA) techniques to resolve the gray sheep problem. We use degree centrality in SNA to identify users with unique preferences (gray sheep). Degree centrality refers to the number of direct links to and from a node. In a network of users connected through common preferences or tastes, users with unique tastes have fewer links to other users (nodes) and are isolated from them. Gray sheep can therefore be identified by calculating the degree centrality of each node. We divide the dataset into two parts, gray sheep and others, based on the degree centrality of the users, and then apply different similarity measures and recommendation methods to the two datasets. The algorithm in more detail is as follows:

Step 1: Convert the initial data, which is a two-mode network (user to item), into a one-mode network (user to user).
Step 2: Calculate the degree centrality of each node and separate out the nodes whose degree centrality is lower than a preset threshold. The threshold is determined by simulation such that the accuracy of CF on the remaining dataset is maximized.
Step 3: Apply an ordinary CF algorithm to the remaining dataset.
Step 4: Because the separated dataset consists of users with unique tastes, an ordinary CF algorithm cannot generate recommendations for them. A "popular item" method is used to generate recommendations for these users.

The F measures of the two datasets are weighted by their numbers of nodes and summed to give the final performance metric.

To test the performance improvement from the new algorithm, an empirical study was conducted using a publicly available dataset, the MovieLens data from the GroupLens research team. We used 100,000 evaluations by 943 users of 1,682 movies. The proposed algorithm was compared with an ordinary CF algorithm using the Best-N-neighbors and cosine similarity methods. The empirical results show that the F measure improved by about 11% on average when the proposed algorithm was used.

Past studies aiming to improve CF performance typically used information beyond users' evaluations, such as demographic data. Some studies applied SNA techniques as a new similarity metric. This study is novel in that it uses SNA to partition the dataset. It shows that the performance of CF can be improved, without any additional information, when SNA techniques are used as proposed. The study has several theoretical and practical implications. It shows empirically that the characteristics of a dataset can affect the performance of CF recommender systems, which helps researchers understand the factors affecting CF performance. It also opens the door to future studies that apply SNA to CF to analyze dataset characteristics. In practice, it provides guidelines for improving the performance of CF recommender systems with a simple modification.
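The steps described in the abstract can be illustrated with a small Python sketch. This is not the authors' implementation: the abstract does not say how user-to-user links are defined or what threshold was chosen, so the `min_common` link rule, the threshold value, the toy `ratings` matrix, and all function names here are assumptions made for illustration. The ordinary CF step (Step 3) is omitted, and the weighted F measure is normalized by the total number of users, which is also an assumption.

```python
import numpy as np

def project_to_user_network(ratings, min_common=2):
    """Step 1: two-mode (user x item) data -> one-mode (user x user) adjacency.
    Assumed link rule: two users are linked if they rated >= min_common items in common."""
    rated = (ratings > 0).astype(int)
    common = rated @ rated.T              # common[i, j] = items rated by both i and j
    adjacency = (common >= min_common).astype(int)
    np.fill_diagonal(adjacency, 0)        # no self-links
    return adjacency

def degree_centrality(adjacency):
    """Step 2a: number of direct links for each user node."""
    return adjacency.sum(axis=1)

def split_gray_sheep(centrality, threshold):
    """Step 2b: users below the threshold are treated as gray sheep."""
    gray = np.flatnonzero(centrality < threshold)
    others = np.flatnonzero(centrality >= threshold)
    return gray, others

def popular_items(ratings, user, top_n=2):
    """Step 4: recommend the most frequently rated items the user has not rated."""
    counts = (ratings > 0).sum(axis=0)
    unseen = np.flatnonzero(ratings[user] == 0)
    ranked = unseen[np.argsort(-counts[unseen], kind="stable")]
    return ranked[:top_n].tolist()

def weighted_f(f_gray, n_gray, f_others, n_others):
    """F measures of the two groups, weighted by group size (normalized here)."""
    return (f_gray * n_gray + f_others * n_others) / (n_gray + n_others)

# Hypothetical toy data: 5 users x 6 items, 0 means "not rated".
ratings = np.array([
    [5, 4, 0, 3, 0, 0],
    [4, 5, 3, 0, 0, 0],
    [0, 4, 5, 4, 0, 0],
    [0, 4, 4, 5, 0, 0],
    [0, 0, 0, 0, 4, 5],   # shares no items with anyone else
])

centrality = degree_centrality(project_to_user_network(ratings))
gray, others = split_gray_sheep(centrality, threshold=1)
recommendations = {int(u): popular_items(ratings, u) for u in gray}
```

In this toy matrix the last user shares no rated items with the others, so that user ends up with degree centrality 0 and is routed to the popular-item method. In the study itself the threshold is tuned by simulation rather than fixed by hand.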

Journal: Journal of Intelligence and Information Systems
Publication Date: Jun 30, 2014
Citations: 4

Similar Papers

Social Network Analysis for the Effective Adoption of Recommender Systems
01 Jan 2010

The Power of Social Network Construction and Analysis for Knowledge Discovery in the Medical Referral Process
Wadhah Almansoori ... Reda Alhajj
Journal of Organizational Computing and Electronic Commerce | VOL. 24
03 Apr 2014

Recommender systems using cluster-indexing collaborative filtering and social data analytics
Kyoung-Jae Kim ... Hyunchul Ahn
International Journal of Production Research | VOL. 55
07 Feb 2017

A Literature Review and Classification of Recommender Systems on Academic Journals
01 Mar 2011
