Abstract
As demand grows for recommendation systems that select and present information from the vast amount available on the Internet according to various criteria, the related technologies have advanced as well. Among them, collaborative filtering is the technique that analyzes customer preference information to measure the similarity between customers or between items and recommends items on that basis. Because its recommendations become more accurate and its performance improves as more data become available, it has been used in various fields. However, it also has limitations. Two are considered typical: the data sparsity problem, in which recommendation performance degrades when preference information is insufficient, and the scalability problem, in which computation time increases steeply as the amount of data grows. Although studies addressing these limitations have continued, more practical studies are needed. This paper therefore proposes a collaborative filtering technique that mitigates the data sparsity and scalability limitations through two-stage clustering. First, to address data sparsity, it proposes a method that clusters customers using their basic information and then predicts a specific customer's preference scores from the preference information of that customer's cluster in order to fill the customer-item matrix. Second, to address scalability, it proposes a method that reduces the data space through clustering. Finally, it verifies through experiments how much the proposed method improves the performance of collaborative filtering.
By improving data sparsity and scalability, which are inherent problems of collaborative filtering recommendation systems, this study makes it possible to recommend more accurately and quickly even when customer preference information is insufficient or the amount of data is large.

Keywords: Data sparsity; Data scalability; Collaborative filtering; Recommendation; K-means clustering
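The two-stage idea described in the abstract can be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not the paper's implementation: the abstract does not give the fill rule, the similarity measure, or the number of clusters, so the cluster-mean fill, the cosine similarity, `k=2`, the toy data, and the helper names `kmeans`, `fill_by_cluster`, and `predict` are all hypothetical choices made here.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per row of X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # Squared Euclidean distance from every row to every center.
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return labels

def fill_by_cluster(ratings, labels):
    """Stage 1 (assumed rule): replace NaN with the cluster's item mean."""
    filled = ratings.copy()
    global_mean = np.nanmean(ratings)
    for c in np.unique(labels):
        rows = labels == c
        block = ratings[rows]
        counts = (~np.isnan(block)).sum(axis=0)
        sums = np.nansum(block, axis=0)
        # Fall back to the global mean if nobody in the cluster rated the item.
        item_means = np.where(counts > 0, sums / np.maximum(counts, 1), global_mean)
        filled[rows] = np.where(np.isnan(block), item_means, block)
    return filled

def predict(filled, labels, user, item):
    """Stage 2 (assumed rule): cosine-weighted mean over same-cluster peers."""
    peers = np.flatnonzero(labels == labels[user])
    peers = peers[peers != user]
    if len(peers) == 0:
        return float(filled[:, item].mean())
    target = filled[user]
    sims = np.array([
        target @ filled[p]
        / (np.linalg.norm(target) * np.linalg.norm(filled[p]) + 1e-12)
        for p in peers
    ])
    weights = np.clip(sims, 0.0, None)
    if weights.sum() == 0:
        return float(filled[peers, item].mean())
    return float(weights @ filled[peers, item] / weights.sum())

# Toy data (invented for illustration): columns are age and a binary attribute.
demographics = np.array(
    [[25, 0], [27, 0], [52, 1], [55, 1], [24, 0], [50, 1]], dtype=float
)
# Customer-item matrix on a 1-5 scale; NaN marks a missing preference score.
ratings = np.array([
    [5, np.nan, 1, np.nan],
    [4, 5, np.nan, 1],
    [np.nan, 1, 5, 4],
    [1, np.nan, 4, 5],
    [np.nan, 4, 2, np.nan],
    [2, 1, np.nan, 5],
], dtype=float)

stage1 = kmeans(demographics, k=2)         # cluster on basic customer info
filled = fill_by_cluster(ratings, stage1)  # densify the sparse matrix
stage2 = kmeans(filled, k=2)               # shrink the neighbor search space
score = predict(filled, stage2, user=0, item=1)
print(round(score, 2))
```

Restricting the similarity computation to customers in the same second-stage cluster is what reduces the search space in this sketch: each prediction compares the target customer with the members of one cluster rather than with every customer.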